This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!
where the signed power a
equals
a
= |a|p sign a.
120
S. Kring et al.
The covariance between two normal random variables X and Y can be interpreted as the inner product of the space L2 (Ω, A, P). The covariation is the analogue of two α-stable random variables X and Y in the space Lα (Ω, A, P). Unfortunately, Lα (Ω, A, P) is not a Hilbert space and this is why it lacks some of the desirable and strong properties of the covariance. It follows immediately from the definition that the covariation is linear in the first argument. Unfortunately, this statement is not true for the second argument. In the case of α = 2, the covariation equals the covariance. Proposition 1. Let (X, Y ) be joinly symmetric stable random vectors with α > 1. Then for all 1 < p < α, EXY
[X, Y ]α = , E|Y |p ||Y ||α α where ||Y ||α denotes the scale parameter of Y . Proof. For the proof, see [19] . In particular, we apply Proposition 1 in Sect. 4.1 in order to derive an estimator for the dispersion matrix of an α-stable sub-Gaussian distribution. 2.3 α-Stable Sub-Gaussian Random Vectors In general, as pointed out in the last section, α-stable random vectors have a complex dependence structure defined by the spectral measure. Since this measure is very difficult to estimate even in low dimensions, we have to retract to certain subclasses, where the spectral measure becomes simpler. One of these special classes is the multivariate α-stable sub-Gaussian distribution. Definition 6. Let Z be a zero mean Gaussian random vector with variance 2/α covariance matrix Σ and W ∼ Sα/2 ((cos πα , 1, 0) a totally skewed stable 4 ) random variable independent of Z. The random vector √ X = µ + WZ is said to be a sub-Gaussian α-stable random vector. The distribution of X is called multivariate α-stable sub-Gaussian distribution. An α-stable sub-Gaussian random vector inherits its dependence structure from the underlying Gaussian random vector. The matrix Σ is also called the dispersion matrix. The following theorem and proposition show properties of α-stable sub-Gaussian random vectors. We need these properties to derive estimators for the dispersion matrix. Theorem 3. The sub-Gaussian α-stable random vector X with location parameter µ ∈ Rd has the characteristic function
Estimation of α-Stable Sub-Gaussian Distributions for Asset Returns
1
E(eit X ) = eit µ e−( 2 t Σt)
α/2
121
,
where Σij = EZi Zj , i, j = 1, . . . , d are the covariances of the underlying Gaussian random vector (Z1 , . . . , Zd ) . For α-stable sub-Gaussian random vectors, we do not need the spectral measure in the characteristic functions. This fact simplifies the calculation of the projection functions. Proposition 2. Let X ∈ Rd be an α-stable sub-Gaussian random vector with location parameter µ ∈ Rd and dispersion matrix Σ. Then, for all a ∈ Rd , we have a X ∼ Sα (σ(a), β(a), µ(a)), where (i) σ(a) = ( 12 a Σa)1/2 (ii) β(a) = 0 (iii) µ(a) = a µ. Proof. It is well known that the distribution of a X is determined by its characteristic function. E(exp(it(a X))) = E(exp(i(ta )X))) 1 = exp(ita µ) exp(−| (ta) Σ(ta)|α/2 ) 2 1 = exp(ita µ) exp(−| t2 a Σa|α/2 ) 2 1 1 = exp(−|t|α |( a Σa) 2 |α + ita µ) 2 If we choose σ(a) = ( 12 a Σa)1/2 , β(a) = 0 and µ(a) = a µ, then for all t ∈ R we have + + , πα , E(exp(it(a X))) = exp −σ(a)α |t|α 1 − iβ(a) tan (sign t) + iµ(a)t . 2 In particular, we can calculate the entries of the dispersion matrix directly. Corollary 1. Let X = (X1 , . . . , Xn ) be an α-stable sub-Gaussian random vector with dispersion matrix Σ. Then we obtain (i) σii = 2σ(ei )2 σ 2 (ei +ej )−σ 2 (ei −ej ) (ii) σij = . 2 Since α-stable sub-Gaussian random vectors inherit their dependence structure of the underlying Gaussian vector, we can interpret σii as the quasivariance of the component Xi and σij as the quasi-covariance between Xi and Xj .
122
S. Kring et al.
2 Proof. It follows from Proposition 2 that σ(ei ) = 12 σii . Furthermore, if we set 1 a = ei + ej with i = j, we yield σ(ei + ej ) = ( 2 (σii + 2σij + σjj ))1/2 and for b = ei − ej , we obtain σ(ei − ej ) = ( 12 (σii − 2σij + σjj ))1/2 . Hence, we have
σij =
σ 2 (ei + ej ) − σ 2 (ei − ej ) . 2
Proposition 3. Let X = (X1 , . . . , Xn ) be a zero mean α-stable sub-Gaussian random vector with dispersion matrix Σ. Then it follows [Xi , Xj ]α = 2−α/2 σij σjj
(α−2)/2
.
Proof. For a proof see [19].
3 α-Stable Sub-Gaussian Distributions as Elliptical Distributions Many important properties of α-stable sub-Gaussian distributions with respect to risk management, portfolio optimization, and principal component analysis can be understood very well, if we regard them as elliptical or normal variance mixture distributions. Elliptical distributions are a natural extension of the normal distribution which is a special case of this class. They obtain their name because of the fact that, their densities are constant on ellipsoids. Furthermore, they constitute a kind of ideal environment for standard risk management, see [4]. First, correlation and covariance have a very similar interpretation as in the Gaussian world and describe the dependence structure of risk factors. Second, the Markowitz optimization approach is applicable. Third, value-at-risk is a coherent risk measure. Fourth, they are closed under linear combinations, an important property in terms for portfolio optimization. And finally, in the elliptical world minimizing risk of a portfolio with respect to any coherent risk measures leads to the same optimal portfolio. Empirical investigations have shown that multivariate return data for groups of similar assets often look roughly elliptical and in market risk management the elliptical hypothesis can be justified. Elliptical distributions cannot be applied in credit risk or operational risk, since the hypothesis of elliptical risk factors is found to be rejected. 3.1 Elliptical Distributions and Basic Properties Definition 7. A random vector X = (X1 , . . . , Xd ) has (i) a spherical distribution if, for every orthogonal matrix U ∈ Rd×d , d
U X = X.
0.1
0.1
0.08
0.08
0.06
0.06
0.04
0.04
0.02
0.02
DBK
DCX
Estimation of α-Stable Sub-Gaussian Distributions for Asset Returns
0
0
−0.02
−0.02
−0.04
−0.04
−0.06
−0.06
−0.08
123
−0.08
−0.1 −0.1 −0.08 −0.06 −0.04 −0.02
0
0.02
0.04
0.06
0.08
−0.1 −0.15
0.1
−0.1
−0.05
BMW
0
0.05
0.1
0.15
0.2
CBK
(a)
(b)
Fig. 3. Bivariate scatterplot of BMW vs. DaimlerChrysler and Commerzbank vs. Deutsche Bank. Depicted are daily log-returns from May 6, 2002 through March 31, 2006
(ii)an elliptical distribution if d
X = µ + AY, where Y is a spherical random variable and A ∈ Rd×K and µ ∈ Rd are a matrix and a vector of constants, respectively. Elliptical distributions are obtained by multivariate affine transformations of spherical distributions. Figure 3a,b depict a bivariate scatterplot of BMW vs. Daimler Chrysler and Commerzbank vs. Deutsche Bank log-returns. Both scatterplots are roughly elliptical contoured 3. Theorem 4. The following statements are equivalent (i) X is spherical. (ii) There exists a function ψ of a scalar variable such that, for all t ∈ Rd ,
φX (t) = E(eit X ) = ψ(t t) = ψ(t21 + ... + t2d ). (iii) For all a ∈ Rd , we have a X = ||a||X1 d
(iv) X can be represented as d
X = RS where S is uniformly distributed on S d−1 = {x ∈ Rd : x x = 1} and R ≥ 0 is a radial random variable independent of S. Proof. See [14] ψ is called the characteristic generator of the spherical distribution and we use the notation X ∈ Sd (ψ).
124
S. Kring et al. d
Corollary 2. Let X be a d-dimensional elliptical distribution with X = µ + AY , where Y is spherical and has the characteristic generator ψ. Then, the characteristic function of X is given by
φX (t) := E(eit X ) = eit µ ψ(t Σt), where Σ = AA . Furthermore, X can be represented by X = µ + RAS, where S is the uniform distribution on S d−1 and R ≥ 0 is a radial random variable. Proof. We notice that
φX (t) = E(eit X ) = E(eit (µ+AY ) ) = eit µ E(ei(A t) Y ) = eit µ ψ((A t) (A t))
= eit µ ψ(t AA t) Since the characteristic function of a random variate determines the distribution, we denote an elliptical distribution by X ∼ Ed (µ, Σ, ψ). Because of
A µ + RAS = µ + cR S, c the representation of the elliptical distribution in (2) is not unique. We call the vector µ the location parameter and Σ the dispersion matrix of an elliptical distribution, since first and second moments of elliptical distributions do not necessarily exist. But if they exist, the location parameter equals the mean and the dispersion matrix equals the covariance matrix up to a scale parameter. In order to have uniqueness for the dispersion matrix, we demand det(Σ) = 1. If we take any affine linear combination of an elliptical random vector, then, this combination remains elliptical with the same characteristic generator ψ. Let X ∼ Ed (µ, Σ, ψ), then it can be shown with similar arguments as in Corollary 2 that BX + b ∼ Ek (Bµ + b, BΣB , ψ) where B ∈ Rk×d and b ∈ Rk . Let X be an elliptical distribution. Then the density f (x), x ∈ Rd , exists and is a function of the quadratic form f (x) = det(Σ)−1/2 g(Q) with Q := (x − µ) Σ −1 (x − µ). g is the density of the spherical distribution Y in definition 7. We call g the density generator of X. As a consequence, since Y has an unimodal density, so is the density of X and clearly, the joint density f is constant on hyperspheres Hc = {x ∈ Rd : Q(x) = c}, c > 0. These hyperspheres Hc are elliptically contoured.
Estimation of α-Stable Sub-Gaussian Distributions for Asset Returns
125
Example 1. An α-stable sub-Gaussian √ random vector is an elliptical ranW Z is spherical, where W ∼ Sα dom vector. The random vector 2/α ((cos πα ) , 1, 0) and Z ∼ N (0, 1) because of 4 √
√ d WZ = U WZ
for any orthogonal matrix. The equation is√true, since Z is rotationally symmetric. Hence any linear combination of W Z is an elliptical random vector. The characteristic function of an α-stable sub-Gaussian random vector is given by
1
E(eit X ) = eit µ e−( 2 t Σt)
α/2
due to Theorem 3. Thus, the characteristic generator of an α-stable subGaussian random vector equals ψsub (s, α) = e−( 2 s) 1
2/α
.
Using the characteristic generator, we can derive directly that an α-stable sub-Gaussian random vector is infinitely divisible, since we have
+ ,α/2 n α/2 1 − 1 s ψsub (s, α) = e−( 2 s) = e 2 n2/α + s + ,,n = ψsub , α . n2/α 3.2 Normal Variance Mixture Distributions Normal variance mixture distributions are a subclass of elliptical distributions. We will see that they inherit their dependence structure from the underlying Gaussian random vector. Important distributions in risk management such as the multivariate t-, generalized hyperbolic, or α-stable sub-Gaussian distribution belong to this class of distributions. Definition 8. The random vector X is said to have a (multivariate) normal variance mixture distribution (NVMD) if X = µ + W 1/2 AZ where (i) Z ∼ Nd (0, Id ); (ii) W ≥ 0 is a non-negative, scalar-valued random variable which is independent of G, and (iii) A ∈ Rd×d and µ ∈ Rd are a matrix of constants, respectively. We call a random variable X with NVMD a normal variance mixture (NVM). We observe that Xw = (X|W = w) ∼ Nd (µ, wΣ), where Σ = AA .
126
S. Kring et al.
We can interpret the distribution of X as a composite distribution. According to the law of W , we take normal random vectors Xw with mean zero and covariance matrix wΣ randomly. In the context of modeling asset returns or risk factor returns with normal variance mixtures, the mixing variable W can be thought of as a shock that arises from new information and influences the volatility of all stocks. √ d √ Since U W Z = W Z for all U ∈ O(d) every normal variance mixture distribution is an elliptical distribution. The distribution F of X is called the mixing law. Normal variance mixture are closed under affine linear combinations, since they are elliptical. This can also be seen directly by √ √ d BX + µ1 = B( W AZ + µ0 ) + µ1 = W BAZ + (Bµ0 + µ1 ) √ ˜ +µ ˜. = W AZ This property makes NVMDs and, in particular, MSSDs applicable to portfolio theory. The class of NVMD has the advantage that structural information about the mixing law W can be transferred to the mixture law. This is true, for example, for the property of infinite divisibility. If the mixing law is infinitely divisible, then so is the mixture law. (For further information see [2].) It is obvious from the definition that an α-stable sub-Gaussian random vector 2/α , 1, 0). is also a normal variance mixture with mixing law W ∼ Sα ((cos πα 4 ) 3.3 Market Risk Management with Elliptical Distributions In this section, we discuss the properties of elliptical distributions in terms of market risk management and portfolio optimization. In risk management, one is mainly interested in modeling the extreme losses which can occur. From empirical investigations, we know that an extreme loss in one asset very often occurs with high losses in many other assets. We show that this market behavior cannot be modeled by the normal distribution but, with certain elliptical distributions, e.g. α-stable sub-Gaussian distribution, we can capture this behavior. The Markowitz’s portfolio optimization approach which is originally based on the normal assumption can be extended to the class of elliptical distributions. Also, statistical dimensionality reduction methods such as the principal component analysis are applicable to them. But one must be careful, in contrast to the normal distribution, these principal components are not independent. Let F be the distribution function of the random variable X, then we call F ← (α) = inf{x ∈ R : F (x) ≥ α} the quantile function. F ← is also the called generalized inverse, since we have F (F ← (α)) = α, for any df F .
Estimation of α-Stable Sub-Gaussian Distributions for Asset Returns
127
Definition 9. Let X1 and X2 be random variables with dfs F1 and F2 . The coefficient of the upper tail dependence of X1 and X2 is λu := λu (X1 , X2 ) := lim− P (X2 > F2← (q)|X1 > F1← (q)), q→1
(2)
provided a limit λu ∈ [0, 1] exists. If λu ∈ (0, 1], then X1 and X2 are said to show upper tail dependence; if λu = 0, they are asymptotically independent in the upper tail. Analogously, the coefficient of the lower tail dependence is λl = λl (X1 , X2 ) = lim+ P (X2 ≤ F ← (q)|X1 ≤ F1← (q)), q→0
(3)
provided a limit λl ∈ [0, 1] exists. For a better understanding of tail dependence we introduce the concept of copulas. Definition 10. A d-dimensional copula is a distribution function on [0, 1]d . It is easy to show that for U ∼ U (0, 1), we have P (F ← (U ) ≤ x) = F (x) and if the random variable Y has a continuous df G, then G(Y ) ∼ U (0, 1). The concept of copulas gained its importance because of Sklar’s Theorem. Theorem 5. Let F be a joint distribution function with margins F1 , . . . , Fd . Then, there exists a copula C : [0, 1]d → [0, 1] such that for all x1 , . . . , xd in R = [∞, ∞], F (x1 , . . . , xd ) = C(F1 (x1 ), . . . , Fd (xd )).
(4)
If the margins are continuous, then C is unique; otherwise C is uniquely determined on F1 (R) × F2 (R) × . . . × Fd (R). Conversely, if C is a copula and F1 , ..., Fd are univariate distribution functions, the function F defined in (4) is a joint distribution function with margins F1 , . . . , Fd . This fundamental theorem in the field of copulas, shows that any multivariate distribution F can be decomposed in a copula C and the marginal distributions of F . Vice versa, we can use a copula C and univariate dfs to construct a multivariate distribution function. With this short excursion in the theory of copulas we obtain a simpler expression for the upper and the lower tail dependencies, i.e., P (X2 ≤ F ← (q), X1 ≤ F1← (q)) P (X1 ≤ F1← (q)) q→0+ C(q, q) . = lim + q q→0
λl = lim
d
Elliptical distributions are radially symmetric, i.e., µ − X = µ + X, hence the coefficient of lower tail dependence λl equals the coefficient of upper tail dependence λu . We denote with λ the coefficient of tail dependence.
128
S. Kring et al.
We call a measurable function f : R+ → R+ regularly varying (at ∞) with index α ∈ R if, for any t > 0, limx→∞ f (tx)/f (x) = tα . It is now important to notice that regularly varying functions with index α ∈ R behave asymptotically like a power function. An elliptically distributed random vector X = RAU is said to be regularly varying with tail index α, if the function f (x) = P (R ≥ x) is regularly varying with tail index α. (see [18].) The following theorem shows the relation between the tail dependence coefficient and the tail index of elliptical distributions. Theorem 6. Let X ∼ Ed (µ, Σ, ψ) be regularly varying with tail index α > 0 and Σ a positive definite dispersion matrix. Then, every pair of components of X, say Xi and Xj , is tail dependent and the coefficient of tail dependence corresponds to 0 f (ρij ) sα √ 2 ds 0 λ(Xi , Xj ; α, ρij ) = 0 1 α1−s (5) √s ds 0 1−s2 4
where f (ρij ) =
1+ρij 2
√ and ρij = σij / σii σjj .
Proof. See [20] . It is not difficult to show that an α-stable sub-Gaussian distribution is regularly varying with tail index α. The coefficient of tail dependence between two components, say Xi and Xj , is determined by equation (5) in Theorem 6. In the next example, we demonstrate that the coefficient of tail dependence of a normal distribution is zero. Example 2. Let (X1 , X2 ) be a bivariate normal random vector with correlation ρ ∈ (−1, 1) and standard normal marginals. Let Cρ be the corresponding Gaussian copula due to Sklar’s theorem, then, by the L’Hˆ opital rule, Cρ (q, q) l H dCρ (q, q) Cρ (q + h, q + h) − Cρ (q, q) = lim+ lim+ = lim+ q dq h q→0 q→0 q→0 h→0 Cρ (q + h, q + h) − Cρ (q + h, q) + Cρ (q + h, q) − Cρ (q, q) = lim+ lim h q→0 h→0 P (U1 ≤ q + h, q ≤ U2 ≤ q + h)) = lim+ lim P (q ≤ U2 ≤ q + h) q→0 h→0 P (q ≤ U1 ≤ q + h, U2 ≤ q) + lim+ lim P (q ≤ U1 ≤ q + h) q→0 h→0 = lim P (U2 ≤ q|U1 = q) + lim P (U1 ≤ q|U2 = q)
λ = lim+
q→0+
q→0+
= 2 lim P (U2 ≤ q|U1 = q) q→0+
= 2 lim+ P (Φ−1 (U2 ) ≤ Φ−1 (q)|Φ−1 (U1 ) = Φ−1 (q)) q→0
= 2 lim P (X2 ≤ x|X1 = x) x→−∞
Estimation of α-Stable Sub-Gaussian Distributions for Asset Returns
Since we have X2 |X1 = x ∼ N (ρx, 1 − ρ2 ), we obtain λ = 2 lim Φ(x 1 − ρ/ 1 + ρ) = 0 x→−∞
129
(6)
Equation (6) shows that beside the fact that a normal distribution is not heavy tailed the components are asymptotically independent. This, again, is a contradiction to empirical investigations of market behavior. Especially, in extreme market situations, when a financial market declines in value, market participants tend to behave homogeneously, i.e., they leave the market and sell their assets. This behavior causes losses in many assets simultaneously. This phenomenon can only be captured by distributions which are asymptotically dependent. [12] optimizes the risk and return behavior of a portfolio based on the expected returns and the covariances of the returns in the considered asset universe. The risk of a portfolio consisting of these assets is measured by the variance of the portfolio return. In addition, he assumes that the asset returns follow a multivariate normal distribution with mean µ and covariance Σ. This approach leads to the following optimization problem min w Σw,
w∈Rd
subject to w µ = µp w 1 = 1. This approach can be extended in two ways. First, we can replace the assumption of normally distributed asset returns by elliptically distributed asset returns and second, instead of using the variance as the risk measure, we can apply any positive-homogeneous, translation-invariant measure of risk to rank risk or to determine the optimal risk-minimizing portfolio. In general, due to the work of [1], a risk measure is a real-valued function : M → R, where M ⊂ L0 (Ω, F, P ) is a convex cone. L0 (Ω, F, P ) is the set of all almost surely finite random variables. The risk measure is translation invariant if for all L ∈ M and every l ∈ R, we have (L + l) = (L) + l. It is positivehomogeneous if for all λ > 0, we have (λL) = λ(L). Note, that value-at-risk (VaR) as well as conditional value-at-risk (CVaR) fulfill these two properties. Theorem 7. Let the random vector of asset returns X be Ed (µ, Σ, ψ). We d denote by W = {w ∈ Rd : Asi=1 wi = 1} the set of portfolio weights. d sume that the current value of the portfolio is V and let L(w) = V i=1 wi Xi be the (linearized) portfolio loss. Let be a real-valued risk measure depending only on the distribution of a risk. Suppose is positive homogeneous and translation invariant and let Y = {w ∈ W : −w µ = m} be the subset of portfolios giving expected return m. Then, argminw∈Y (L(w)) = argminw∈Y w Σw.
130
S. Kring et al.
Proof. See [14].
The last theorem stresses that the dispersion matrix contains all the information for the management of risk. In particular, the tail index of an elliptical random vector has no influence on optimizing risk. Of course, the index has an impact on the value of the particular risk measure like VaR or CVaR, but not on the weights of the optimal portfolio, due to the Markowitz approach. In risk management, we have very often to deal with portfolios consisting of many different assets. In many of these cases it is important to reduce the dimensionality of the problem in order to not only understand the portfolio’s risk but also to forecast the risk. A classical method to reduce the dimensionality of a portfolio whose assets are highly correlated is principal component analysis (PCA). PCA is based on the spectral decomposition theorem. Any symmetric or positive definite matrix Σ can be decomposed in Σ = P DP , where P is an orthogonal matrix consisting of the eigenvectors of Σ in its columns and D is a diagonal matrix of the eigenvalues of Σ. In addition, we demand λi ≥ λi−1 , i = 1, . . . , d for the eigenvalues of Σ in D. If we apply the spectral decomposition theorem to the dispersion matrix of an elliptical random vector X with distribution Ed (µ, Σ, ψ), we can interpret the principal components which are defined by Yi = Pi (X − µ), i = 1, . . . , d,
(7)
as the main statistical risk factors of the distribution of X in the following sense P1 ΣP1 = max{w Σw : w w = 1}.
(8)
More generally, Pi ΣPi = max{w Σw : w ∈ {P1 , . . . , Pi−1 }⊥ , w w = 1}. From equation (8), we can derive that the linear combination Y1 = P1 (X − µ) has the highest dispersion of all linear combinations and Pi X has the highest dispersion in the linear subspace {P1 , ..., Pi−1 }⊥ . If we interpret trace Σ = d j=1 σii as a measure of total variability in X and since we have d i=1
Pi ΣPi =
d i=1
λi = trace Σ =
d
σii ,
i=1
we can measure the ability of the first k components to explain the principal k d variability of X by the ratio j=1 λj / j=1 λj . Furthermore, we can use the principal components to construct a statistical factor model. Due to equation (7), we have
Estimation of α-Stable Sub-Gaussian Distributions for Asset Returns
131
Y = P (X − µ), which can be inverted to X = µ + P Y. If we partition Y due to (Y1 , Y2 ) , where Y1 ∈ Rk and Y2 ∈ Rd−k and also P leading to (P1 , P2 ), where P1 ∈ Rd×k and P2 ∈ Rd×(d−k) , we obtain the representation X = µ + P1 Y1 + P2 Y2 = µ + P1 Y1 + . But one has to be careful. In contrast to the normal distribution case, the principal components are only quasi-uncorrelated but not independent. Furthermore, we obtain for the coefficient of tail dependence between two principal components, say Yi and Yj , 0 √1/2 sα √ 2 ds 0 . λ(Yi , Yj , 0, α) = 0 1 α1−s s √ ds 0 1−s2
4 Estimation of an α-Stable Sub-Gaussian Distributions In contrast to the general case of multivariate α-stable distributions, we show that the estimation of the parameters of an α-stable sub-Gaussian distribution is feasible. As shown in the last section, α-stable sub-Gaussian distributions belong to the class of elliptical distributions. In general, one can apply a twostep estimation procedure for the elliptical class. In the first step, we estimate independently the location parameter µ ∈ Rd and the positive definite dispersion matrix Σ up to a scale parameter. In the second step, we estimate the parameter of the radial random variable W . We apply this idea to α-stable sub-Gaussian distributions. In Sects. 4.1 and 4.2 we present our main theoretical results, deriving estimators for the dispersion matrix and proving their consistency. In Sect. 4.3 we present a new procedure to estimate the parameter α of an α-stable sub-Gaussian distribution. 4.1 Estimation of the Dispersion Matrix with Covariation In Sect. 2.1, we introduced the covariation of a multivariate α-stable random vector. This quantity allows us to derive a consistent estimator for an α-stable dispersion matrix. In order to shorten the notation we denote with σj = σ(ej ) the scale parameter of the jth component of an α-stable random vector X = (X1 , . . . , Xd ) ∈ Rd .
132
S. Kring et al.
Proposition 4. (a) Let X = (X1 , . . . , Xd ) ∈ Rd be a zero mean α-stable sub-Gaussian random vector with positive definite dispersion matrix Σ ∈ Rd×d . Then, we have σij =
2 σ(ej )2−p E(Xi Xj
), cα,0 (p)p
(9)
where p ∈ (1, α), cα,0 (p) = E(|Y |p )1/p > 0 and Y ∼ Sα (1, 0, 0). (b) Let X1 , X2 , . . . , Xn be independent and identically distributed samples with the same distribution as the random vector X. Let σ ˆj be a consistent estimator for σj , the scale parameter of the jth component of X, then, the (2) estimator σ ˆij (n, p), defined as 2 2−p 1
σ ˆ Xti Xtj , cα,0 (p)p j n t=1 n
(2)
σ ˆij (n, p) =
(10)
is a consistent estimator for σij , where Xti refers to the ith entries of the observation Xt , t = 1, . . . , n, cα,0 (p) = E(|Y |p )1/p and Y ∼ Sα (1, 0, 0). Proof. (a) Due to the Proposition 3 we have Prop. 3
=
2α/2 σjj
Prop.1
=
2α/2 σjj
Lemma 1
=
2α/2 σjj
Corollary 1(i)
2p/2 σjj
σij
=
(2−p)/2
[Xi , Xj ]α
(2−α)/2
E(Xi Xj
)σjα /E(|Xj |p )
(2−α)/2
E(Xi Xj
)σjα /(cα,0 (p)p σjp )
(2−p)/2
E(Xi Xj
)/(cα,0 (p)p )
(b) The estimator σ ˆj is consistent and f (x) = x2−p is continuous. Then, the n 2−p
estimator σ ˆj is consistent for σj2−p . n1 k=1 Xki Xkj is consistent for E(Xi Xj
) due to the law of large numbers. Since the product of two consistent estimators is consistent, the estimator 1 2
σ ˆj2−p Xti Xtj p cα,0 (p) n t=1 n
(2)
σ ˆij = = is consistent.
4.2 Estimation of the Dispersion Matrix with Moment-Type Estimators In this section, we present an approach of estimating the dispersion matrix up to a scale parameter which is applicable to the class of normal variance mixtures. In particular, we will see that if we know the tail parameter of an α-stable sub-Gaussian random vector X ∈ Rd , this approach allows us to estimate the dispersion matrix of X. We denote with (Wθ )θ∈Θ a parametric family of positive random variables.
Estimation of α-Stable Sub-Gaussian Distributions for Asset Returns
133
Lemma 2. Let Z ∈ Rd a normally distributed random vector with mean √ zero and positive definite dispersion matrix Σ ∈ Rd×d and let Xθ = µ + Wθ Z, θ ∈ Θ, be a d-dimensional normal variance mixture with location parameter √ µ ∈ Rd . Furthermore, we assume that Wθ has tail parameter α(θ), θ ∈ Θ.4 Then, there exists a function c : {(θ, p) ∈ Θ × R : p ∈ (0, α(θ))} → (0, ∞) such that, for all a ∈ Rd \ {0}, we have E(|a (Xθ − µ)|p ) = c(θ, p)p (a Σa)p/2 .
(11)
The function c is defined by p/2
c(θ, p) = E(Wθ
˜ p ), )E(|Z|
where the random vector Z˜ ∈ Rd is standard normally distributed. Furthermore, c satisfies lim c(θ, p) = 1,
(12)
p→0
for all θ ∈ Θ. We see from (11) that the covariance matrix of Z determines the dispersion matrix of Xθ up to a scaling constant. Proof. Let θ ∈ Θ, p ∈ (0, α(θ)) and a ∈ Rd \ {0}, then we have E(|a (Xθ − µ)|p ) = E(|a Wθ
1/2
p/2
= E(Wθ 5
Z|p )
)E(|a Z/(a Σa)1/2 |p )(a Σa)p/2 . 67 8 =:c(θ,p)
Note that Z˜ = a Z/(a Σa)1/2 is standard normally distributed, hence c(θ, p) p/2 is independent of a. Since E(Wθ ) > 0 and E(|a Z/(a Σa)1/2 |p ) > 0, so c(θ, p) > 0. Since we have xp ≤ max{1, xα(θ) } for p ∈ (0, α(θ)) and x > 0, it follows from Lebesque’s Theorem p ˜ p) lim c(θ, p) = lim E( Wθ ) lim E(|Z| p→0 p→0 p→0 p ˜ p) = E( lim Wθ )E( lim |Z| p→0
p→0
= E(1)E(1) = 1.
4
Note, if the random variable X has tail parameter α then E(|X|p ) < ∞ for all p < α and E(|X|p ) = ∞ for all p ≥ α (see [19]).
134
S. Kring et al.
Theorem 8. Let Z, Xθ , θ ∈ Θ, and c : {(θ, p) ∈ Θ × R : p ∈ (0, α(θ))} → (0, ∞) be as in Lemma 2. Let X1 , . . . , Xn ∈ Rd be i.i.d. samples with the same distribution as Xθ . The estimator 1 |a (Xi − µ)|p n i=1 c(θ, p)p n
σ ˆn (p, a) =
(13)
(i) is unbiased, i.e., E(ˆ σn (p, a)) = (a Σa)p/2 for all a ∈ Rd (ii)is consistent, i.e., P (|ˆ σn (p, a) − (a Σa)p/2 | > ) → 0 (n → ∞), if p < α(θ)/2. Proof. (i) follows directly from Lemma 2. For statement (ii), we have shown that P (n) := P (|ˆ σn (p, a) − (a Σa)p/2 | > ) → 0 (n → ∞). But this holds because of (∗)
1 Var(ˆ σn (p, a)) 2 ! n 1 p = 2 2 Var |a(Xi − µ)| n c(θ, p)2p i=1
P (n) ≤
1 Var(|a (X − µ)|p ) 2 nc(θ, p)2p 1 = 2 (E(|a (X − µ)|2p ) − E(|a (X − µ)|p )2 ) nc(θ, p)2p ) * 1 = 2 c(θ, 2p)2p (a Σa)2p − c(θ, p)2p (a Σa)2p nc(θ, p)2p !
2p c(θ, 2p) 1 − 1 (a Σa)2p → 0 (n → ∞). = 2 n c(θ, p) =
The inequation (∗) holds because of the Chebyshev’s inequality and we have E(|a (X − µ)|2p ) < ∞ because of the assumption p < α(θ)/2. Note, that σ ˆn (p, a)2/p , a ∈ Rd , is a biased, but consistent estimator for (aΣa ). However, since we cannot determine c(θ, p) > 0 we have to use 1 |a (Xi − µ)|p σ ˆn (p, a)c(θ, p) = n i=1 n
p
(14)
Estimation of α-Stable Sub-Gaussian Distributions for Asset Returns
135
as the estimator. But then, Theorem 8 allows us the estimate the dispersion matrix only up to a scaling constant by using linear combinations a X1 , . . . , a Xn , a ∈ Rd of the observations X1 , . . . , Xn . We can apply two different approaches to do this. The first approach is based on the fact that the following equation holds (ei + ej ) Σ(ei + ej ) − (ei − ej ) Σ(ei − ej ) 4
σij =
for all 1 ≤ i < j ≤ d. Then, we can conclude that the estimator σn (p, ei − ej )2/p c(θ, p)ˆ σn (p, ei + ej )2/p − c(θ, p)ˆ 4
σ ˆij (n, p) :=
(15)
is a consistent estimator for σij up to the scaling constant c(θ, p), that is the same for all 1 ≤ i < j ≤ d. For the second approach we use different linear projections ai X1 , . . . , ai Xn , ai ∈ Rd , i = 1, . . . , m, of the observations in order to reconstruct Σ through the following optimization problem ˆ Σ(n, p) = argminΣ∈Rd×d :sym.
m
(c(θ, p)ˆ σn (p, ai )2/p − ai Σai )2 .
(16)
i=1
It is important to note that the optimization problem (16) can be solved by ordinary least squares regression. In the next theorem, we present an estimator that is based on the following observation. Letting Xθ , X1 , X2 , X3 , ... be a sequence of i.i.d. normal variance mixtures, then we have lim lim
n→∞ p→0
p !1/p n n ( 1
a Xi − µ(a)
(∗) = lim |a Xi − µ(a)|1/n n→∞ n i=1 c(θ, p) i=1 = (a Σa)1/2 .
The last equation is true because of (ii) of the following theorem. The proof of the equality (*) can be found in [21]. Theorem 9. Let Z, Xθ , θ ∈ Θ, and c : {(θ, p) ∈ Θ × R : p ∈ (0, α(θ))} → (0, ∞) be as in Lemma 2 and let X1 , . . . , Xn ∈ Rd be i.i.d. samples with the same distribution as Xθ . The estimator ( 1 |a (Xi − µ)|1/n c(θ, 1/n) i=1 n
σ ˆn (a) = (i) is unbiased, i.e.,
E(ˆ σn (a)) = (a Σa)1/2 for all a ∈ R
136
S. Kring et al.
(ii) is consistent, i.e., P (|ˆ σn (a) − (a Σa)1/2 | > ) → 0 (n → ∞). Proof. (i) follows directly from Lemma 2. For statement (ii), we have shown that P (n) := P (|ˆ σn (a) − (a Σa)p/2 | > ) → 0 (n → ∞). But this holds because of (∗)
1 Var(ˆ σn (a)) 2 ! n ( 1 1/n = 2 Var |a (Xi − µ)| c(θ, 1/n)2 i=1
P (n) ≤
1 = 2 c(θ, 1/n)2
n (
E(|a (Xi − µ)|
2/n
)−
i=1
n (
!
E(|a (Xi − µ)|
1/n 2
)
i=1
1 (E(|a (X − µ)|2/n )n − E(|a (X − µ)|1/n ))2n ) 2 c(θ, 1/n)2 1 = 2 (c(θ, 2/n)2 (a Σa)2 − (c(θ, 1/n)2 (a Σa)2 )) c(θ, 1/n)2 !
2 c(θ, 2/n) 1 = 2 − 1 (a Σa)2 → 0 (n → ∞). c(θ, 1/n) =
The inequation (∗) holds because of the Chebyshev’s inequality. Then (ii) follows from (12) in Lemma 2. Note, that σ ˆn2 (a), a ∈ Rd , is a biased but consistent estimator for (a Σa). For the rest of this section we concentrate on α-stable sub-Gaussian random vectors. In this case, the family of positive random variables (Wθ )θ∈Θ is given by (Wα )α∈(0,2) and Wα ∼ Sα/2 (cos(
πα ), 1, 0). 4
Furthermore, the scaling function c(., .) defined in Lemma 2 satisfies Γ ( p+1 2 )Γ (1 − p/α) √ Γ (1 − p/2) π + πp , + p, 2 Γ (p)Γ 1 − , = sin π 2 α
c(α, p)p = 2p
(17)
where Γ (.) is the Gamma-function. For the proof of (17), see [7] and [21]. With Theorems 8 and 9, we derive two estimators for the scale parameter σ(a) of the linear projection a X for an α-stable sub-Gaussian random vector X. The first one is
Estimation of α-Stable Sub-Gaussian Distributions for Asset Returns
σ ˆn (p, a) =
1 n
137
−1 n + πp , + 2 p, sin |a Xi − µ(a)|p Γ (p) Γ 1 − π 2 α i=1
based on Theorem 8. The second one is ( 1 (|a Xi − µ(a)|)1/n c(α, 1/n) i=1
−n ( n + π , 1
2 1 sin Γ = · (|a Xi − µ(a)|)1/n . Γ 1− π 2n n nα i=1 n
σ ˆn (a) =
based on Theorem 9. We can reconstruct the stable dispersion matrix from the linear projections as shown in the (15) and (16). 4.3 Estimation of the Parameter α We assume that the data X1 , . . . , Xn ∈ Rd follow a sub-Gaussian α-stable distribution. We propose the following algorithm to obtain the underlying parameter α of the distribution. (i) Generate i.i.d. samples u1 , u2 , . . . , un according to the uniform distribution on the unit hypersphere S d−1 . (ii) For all i from 1 to n estimate the index of stability αi with respect to ˆ the data ui X1 , ui X2 , . . . , ui Xn , using an unbiased and fast estimator α for the index. (iii) Calculate the index of stability of the distribution by 1 α ˆk . n n
α ˆ=
k=1
The algorithm converges to the index of stability α of the distribution. (For further information we refer to [17].) 4.4 Simulation of α-Stable Sub-Gaussian Distributions Efficient and fast multivariate random number generators are indispensable for modern portfolio investigations. They are important for Monte-Carlo simulations for VaR, which have to be sampled in a reasonable time frame. For the class of elliptical distributions we present a fast and efficient algorithm which will be used for the simulation of α-stable sub-Gaussian distributions in the next section. We assume the dispersion matrix Σ to be positive definite. Hence we obtain for the Cholesky decomposition Σ = AA a unique full-rank lower-triangular matrix A ∈ Rd×d . We present a generic algorithm for generating multivariate elliptically-distributed random vectors. The algorithm is based on the stochastic representation of Corollary 2. For the generation of our samples, we use the following algorithm:
138
S. Kring et al.
Algorithm for ECr (µ, R; ψsub ) simulation (i) (ii) (iii) (iv) (v)
Set Σ = AA , via Cholesky decomposition. Sample a random number from W . Sample d independent random numbers Z1 , . . . , Zd from a N1 (0, 1) law. Set U = Z/||Z|| with √ Z = (Z1 , . . . , Zd ). Return X = µ + W AU
If we want to generate random number with a Ed (µ, Σ, ψsub ) law with the d
2/α algorithm, we choose W = Sα/2 (cos( πα , 1, 0)||Z||2 , where Z is Nd (0, Id) 4 ) distributed. It can be shown that ||Z||2 is independent of both W as well as Z/||Z||.
5 Emprical Analysis of the Estimators In this section, we evaluate two different estimators for the dispersion matrix of an α-stable sub-Gaussian distribution using boxplots. We are primarily interested in estimating the off-diagonal entries, since the diagonal entries σii are essentially only the square of the scale parameter σ. Estimators for the scale parameter σ have been analyzed in numerous studies. Due to Corollary 1 and Theorem 9, the estimator σn (ei − ej ))2 (ˆ σn (ei + ej ))2 − (ˆ 2 is a consistent estimator for σij and the second estimator (1)
σ ˆij (n) =
2
2−p 1 = σ ˆ (e ) Xki Xkj . n j cα,0 (p)p n
(18)
n
(2) σ ˆij (n, p)
(19)
k=1
is consistent because of proposition 4 for i = j. We analyze the estimators empirically. For an empirical evaluation of the estimators described above, it is sufficient to exploit the two-dimensional sub-Gaussian law since for estimating σij we only need the ith and jth component of the data X1 , X2 , . . . , Xn ∈ Rd . For a better understanding of the speed of convergence of the estimators, we choose different sample sizes (n = 100, 300, 500, 1000). Due to the fact that asset returns exhibit an index of stability in the range between 1.5 and 2, we only consider the values α = 1.5, 1.6, . . . , 1.9. For the empirical analysis of the estimators, we choose the matrix
12 A= . 34 The corresponding dispersion matrix is
5 11 . Σ = AA = 11 25
Estimation of α-Stable Sub-Gaussian Distributions for Asset Returns
139
(1)
5.1 Empirical Analysis of σ ˆ ij (n) (1)
For the empirical analysis of σ ˆij (n), we generate samples as described in the previous paragraph and use the algorithm described in Sect. 4.4. The generated samples follow an α-stable sub-Gaussian distribution, i.e., Xi ∼ E2 (0, Σ, ψsub (., α)), i = 1, . . . , n, where A is defined above. Hence, the value of the off-diagonal entry of the dispersion matrix σ12 is 11. (1) In Figs. 4 through 7, we illustrate the behavior of the estimator σ ˆij (n) for several sample sizes and various values for the tail index, i.e., α = 1.5, 1.6, . . . , 1.9. We demonstrate the behavior of the estimator using boxplots based on 1,000 sample runs for each setting of sample length and parameter value. In general, one can see that for all values of α the estimators are medianunbiased. By analyzing the figures, we can additionally conclude that all estimators are slightly skewed to the right. Turning our attention to the rate of convergence of the estimates towards the median value of 11, we examine the boxplots. Figure 4 reveals that for a sample size of n = 100 the interquartile range is roughly equal to four for all values of α. The range diminishes gradually for increasing sample sizes until which can be seen in Figs. 4–7. Sample size per estimation=100 24 22 21 19
Values
17 15 13 11 9 7 5 alpha=1.5
alpha=1.6
alpha=1.7
alpha=1.8
alpha=1.9
Fig. 4. Sample size 100 Sample size per estimation=300 17
Values
15 13 11 9 7 alpha=1.5
alpha=1.6
alpha=1.7
alpha=1.8
Fig. 5. Sample size 300
alpha=1.9
140
S. Kring et al. Sample size per estimation=500
17 16 15
Values
14 13 12 11 10 9 8 7 alpha=1.5
alpha=1.6
alpha=1.7
alpha=1.8
alpha=1.9
Fig. 6. Sample size 500 Sample size per estimation=1000 14
Values
13 12 11 10 9
alpha=1.5
alpha=1.6
alpha=1.7
alpha=1.8
alpha=1.9
Fig. 7. Sample size 1000
Finally in Fig. 7, the interquartile range is equal to about 1.45 for all values of α. The rate of decay is roughly n−1/2 . Extreme outliers can be observed for small sample sizes larger than twice the median, regardless of the value of α. For n = 1, 000, we have a maximal error around about 1.5 times the median. Due to right-skewness, extreme values are observed mostly to the right of the median. (2)
5.2 Empirical Analysis of σ ˆ ij (n, p) We examine the consistency behavior of the second estimator as defined in (19) again using boxplots. In Fig. 5 through 12 we depict the statistical behavior of the estimator. For generating independent samples of various lengths for α = 1.5, 1.6, 1.7, 1.8, and 1.9, and two different values of p we use the algorithm described in Sect. 4.4.5 For the values of p, we select 1.0001 and 1.3, respectively. A value for p closer to one leads to improved properties of the estimator as will be seen. 5
In most of these plots, extreme estimates had to be removed to provide for a clear display of the boxplots.
Estimation of α-Stable Sub-Gaussian Distributions for Asset Returns
141
Sample size per estimation=100 30
Values
25
20
15
10
5 alpha=1.5 alpha=1.6 alpha=1.7 alpha=1.8 alpha=1.9 980 estimations of 1000 are shown in each boxplot; p=1.00001
Fig. 8. Sample size 100, p = 1.00001
Sample size per estimation=100 35 30 25
Values
20 15 10 5 0 −5 −10 alpha=1.5 alpha=1.6 alpha=1.7 alpha=1.8 alpha=1.9 980 estimations of 1000 are shown in each boxplot; p=1.3
Fig. 9. Sample size 100, p = 1.3
In general, we can observe that the estimates are strongly skewed. This is more pronounced for lower values of α while skewness vanishes slightly for increasing α. All figures display a noticeable bias in the median towards low (1) (2) ˆij (n, p). values. Finally, as will be seen, σ ˆij (n) seems more appealing than σ For a sample length of n = 100, Figs. 8 and 9 show that the bodies of the boxplots which are represented by the innerquartile ranges are as high as 4.5 for a lower value of p and α. As α increases, this effect vanishes slightly. However, results are worse for p = 1.3 as already indicated. For sample lengths of n = 300, Figs. 10 and 11 show interquartile ranges between 1.9 and 2.4 for lower values of p. Again, results are worse for p = 1.3. For n = 500, Figs. 12 and 13 reveal ranges between 1.3 and 2.3 as α increases. Again, this worsens when p increases. And finally for samples of length n = 1, 000, Figs. 14 and 15 indicate that for p = 1.00001 the interquartile ranges extend between 1 for α = 1.9 and 1.5 for α = 1.5. Depending on α, the same pattern but on a worse level is displayed for p = 1.3.
142
S. Kring et al. Sample size per estimation=300 18 16
Values
14 12 10 8 6 alpha=1.5 alpha=1.6 alpha=1.7 alpha=1.8 alpha=1.9 980 estimations of 1000 are shown in each boxplot; p=1.00001
Fig. 10. Sample size 300, p = 1.00001 Sample size per estimation=300 30
25
Values
20
15
10
5 alpha=1.5 alpha=1.6 alpha=1.7 alpha=1.8 alpha=1.9 980 estimations of 1000 are shown in each boxplot; p=1.3
Fig. 11. Sample size 300, p = 1.3 Sample size per estimation=500 16
Values
14
12
10
8
6
alpha=1.5 alpha=1.6 alpha=1.7 alpha=1.8 alpha=1.9 980 estimations of 1000 are shown in each boxplot; p=1.00001
Fig. 12. Sample size 500
It is clear from the statistical analysis that concerning skewness and me(1) (2) ˆij (n, p) dian bias, the estimator σ ˆij (n) has properties superior to estimator σ (1)
for both values of p. Hence, we use estimator σ ˆij (n).
Estimation of α-Stable Sub-Gaussian Distributions for Asset Returns Sample size per estimation=500 24 22 20
Values
18 16 14 12 10 8 6 4 alpha=1.5 alpha=1.6 alpha=1.7 alpha=1.8 alpha=1.9 980 estimations of 1000 are shown in each boxplot; p=1.3
Fig. 13. Sample size 500 Sample size per estimation=1000 16 15 14
Values
13 12 11 10 9 8 7
alpha=1.5 alpha=1.6 alpha=1.7 alpha=1.8 alpha=1.9 980 estimations of 1000 are shown in each boxplot; p=1.00001
Fig. 14. Sample size 1000 Sample size per estimation=1000
25
Values
20
15
10
5 alpha=1.5 alpha=1.6 alpha=1.7 alpha=1.8 alpha=1.9 980 estimations of 1000 are shown in each boxplot; p=1.3
Fig. 15. Sample size 1000
143
144
S. Kring et al.
6 Application to the DAX 30 For the empirical analysis of the DAX30 index, we use the data from the Karlsruher Kapitaldatenbank. We analyze data from May 6, 2002 to March 31, 2006. For each company listed in the DAX30, we consider 1, 000 daily log-returns in the study period.6 6.1 Model Check and Estimation of the Parameter α Before fitting an α-stable sub-Gaussian distribution, we assessed if the data are appropriate for a sub-Gaussian model. This can be done with at least two different methods. In the first method, we analyze the data by pursuing the following steps (also [16]): (i) (ii) (iii) (iv) (v)
αi , βˆi , σ ˆi , µ ˆi ), i = 1, . . . , d. For every stock Xi , we estimate θˆ = (ˆ The estimated α ˆ i ’s should not differ much from each other. The estimated βˆi ’s should be close to zero. Bivariate scatterplots of the components should be elliptically contoured. If the data fulfill criteria (ii)-(iv), a sub-Gaussian model can be justified. If there is a strong discrepancy to one of these criteria we have to reject a sub-Gaussian model.
In Table 1, we depict the maximum likelihood estimates for the DAX30 components. The estimated α ˆ i , i = 1, . . . , 29, are significantly below 2, indicating leptokurtosis. We calculate the average to be α ¯ = 1.6. These estimates agree with earlier results from [8]. In that work, stocks of the DAX30 are analyzed during the period 1988 through 2002. Although using different estimation procedures, the results coincide in most cases. The estimated βˆi , ¯ equals i = 1, . . . , 29, are between −0.1756 and 0.1963 and the average, β, −0.0129. Observe the substantial variability in the α’s and that not all β’s are close to zero. These results agree with [16] who analyzed the Dow Jones Industrial Average. Concerning item (iv), it is certainly not feasible to look at each bivariate scatterplot of the data. Figure 16 depicts randomly chosen bivariate plots. Both scatterplots are roughly elliptical contoured. The second method to analyze if a dataset allows for a sub-Gaussian model is quite similar to the first one. Instead of considering the components of the DAX30 directly, we examine randomly chosen linear combinations of the components. We only demand that the Euclidean norm of the weights of the linear combination is 1. Due to the theory of α-stable sub-Gaussian distributions, the index of stability is invariant under linear combinations. Furthermore, the estimated βˆ of linear combination should be close to zero under the subGaussian assumption. These considerations lead us to the following model check procedure: 6
During our period of analysis Hypo Real Estate Holding AG was in the DAX for only 630 days. Therefore we exclude this company from further treatment leaving us with 29 stocks.
Estimation of α-Stable Sub-Gaussian Distributions for Asset Returns
145
Table 1. Stable parameter estimates using the maximum likelihood estimator Name Addidas Allianz Atlanta BASF BMW Bayer Commerzbank Continental Daimler-Chryser Deutsch Bank Deutsche Brse Deutsche Post Telekom Eon FresenMed Henkel Infineon Linde Lufthansa Man Metro MncherRck RWE SAP Schering Siemens Thyssen Tui Volkswagen
Ticker symbol
α ˆ
ADS ALV ALT BAS BMW BAY CBK CON DCX DBK DB1 DPW DTE EOA FME HEN3 IFX LIN LHA MAN MEO MUV2 RWE SAP SCH SIE TKA TUI VOW
1.716 1.515 1.419 1.674 1.595 1.576 1.534 1.766 1.675 1.634 1.741 1.778 1.350 1.594 1.487 1.634 1.618 1.534 1.670 1.684 1.526 1.376 1.744 1.415 1.494 1.574 1.650 1.538 1.690
Average values
(i) (ii) (iii) (iv) (v) (vi)
βˆ 0.196 −0.176 0.012 −0.070 −0.108 −0.077 0.054 0.012 −0.013 −0.084 0.049 −0.071 0.030 −0.069 0.029 0.103 0.019 0.063 0.030 −0.074 0.125 −0.070 −0.004 −0.093 −0.045 −0.125 −0.027 0.035 −0.024
σ ˆ 0.009 0.013 0.009 0.009 0.010 0.011 0.012 0.011 0.011 0.011 0.010 0.011 0.009 0.009 0.010 0.008 0.017 0.009 0.012 0.013 0.011 0.011 0.010 0.011 0.009 0.011 0.011 0.012 0.012
µ ˆ 0.001 −0.001 0.000 0.000 0.000 0.000 0.001 0.002 0.000 0.000 0.001 0.000 0.000 0.000 0.001 0.000 −0.001 0.000 −0.001 0.001 0.001 −0.001 0.000 −0.001 0.000 0.000 0.000 −0.001 0.000
α ¯ = 1, 6 β¯ = −0, 0129
Generate i.i.d. samples u1 , . . . , un ∈ Rd according to the uniform distribution on the hypersphere Sd−1 . αi , βˆi , For each linear combination ui X, i = 1, . . . , n, estimate θi = (ˆ ˆi ). σ ˆi , µ The estimated α ˆ i ’s should not differ much from each other. The estimated βˆi ’s should be close to zero. Bivariate scatterplots of the components should be elliptically contoured. If the data fulfill criteria (ii)–(v) a sub-Gaussian model can be justified.
If we conclude after the model check that our data are sub-Gaussian distributed, we estimate the α of the distribution by taking the mean α ¯ = n 1 α ˆ . This approach has the advantage compared to the former one i i=1 n
146
S. Kring et al.
0.2
0.2 0.15
0.15
0.1
MAN
LHF
0.1
0.05
0.05 0
0
−0.05 −0.05
−0.1
−0.1 −0.08 −0.06 −0.04 −0.02
0
0.02
0.04
0.06
0.08
0.1
0.12
−0.08 −0.06 −0.04 −0.02
0
0.02
0.04
0.06
0.08
0.1
CON
BAS
(a)
(b)
Fig. 16. Bivariate Scatterplots of BASF and Lufthansa in (a); and of Continental and MAN in (b) 2
1
1.9
0.8
1.8
0.6
1.7
0.4
1.6
0.2
1.5
0
1.4
−0.2
1.3
−0.4
1.2
−0.6
1.1
−0.8
1
0
20
40
60
α
80
100
−1
0
20
40 60 Linear combinations
80
100
β
Fig. 17. Scatterplot of the estimated α’s and β’s for 100 linear combinations
that we incorporate more information from the dataset and we can generate more sample estimates α ˆ i and βˆi . In the former approach, we analyze only the marginal distributions. Figure 17 depicts the maximum likelihood estimates for 100 linear combinations due to (ii). We observe that the estimated α ˆ i , i = 1, . . . , n, range from 1.5 to 1.84. The average, α ¯ , equals 1.69. Compared to the first approach, the tail indices increase, meaning less leptokurtosis, but the range of the estimates decreases. The estimated βˆi ’s, i = 1, . . . , n, lie in a range of −0.4 and ¯ is −0.0129. In contrast to the first approach, the vari0.4 and the average, β, ability in the β’s increases. It is certainly not to be expected that the DAX30 log-returns follow a pure i.i.d. α stable sub-Gaussian model, since we do not account for time dependencies of the returns. The variability of the estimated α ˆ ’s might be explained with GARCH-effects such as clustering of volatility.
Estimation of α-Stable Sub-Gaussian Distributions for Asset Returns
147
The observed skewness in the data7 cannot be captured by a sub-Gaussian or any elliptical model. Nevertheless, we observe that the mean of the β’s is close to zero. 6.2 Estimation of the Stable DAX30 Dispersion Matrix In this section, we focus on estimating the sample dispersion matrix of an αstable sub-Gaussian distribution based on the DAX30 data. For the estimation (1) procedure, we use the estimator σ ˆij (n), i = j presented in Sect. 5. Before applying this estimator, we center each time series by subtracting its sample (1) mean. Estimator σ ˆij (n) has the disadvantage that it cannot handle zeros. But after centering the data, there are no zero log-returns in the time series. In general, this is a point which has to be considered carefully. For the sake of clearity, we display the sample dispersion matrix and covariance matrix as heat maps, respectively. Figure 18 is a heat map of the sample dispersion matrix of the α-stable sub-Gaussian distribution. The sample dispersion matrix is positive definite and has a very similar shape and structure as the sample covariance matrix which is depicted in Fig. 19. Dark blue colors correspond to low values, whereas dark red colors depict high values.
TUI
VOW
RWE SAP SCH SIE TKA
MAN
MEO MUV2
LIN LHA
IFX
EOA FME HEN3
DPW DTE
DBK DB1
CON DCX
BAY CBK
BMW
ALT BAS
ADS ALV
ADS ALV ALT BAS BMW BAY CBK CON DCX DBK DB1 DPW DTE EOA FME HEN3 IFX LIN LHA MAN MEO MUV2 RWE SAP SCH SIE TKA TUI VOW
Fig. 18. Heat map of the sample dispersion matrix. Dark blue colors corresponds to low values (min=0.0000278), to blue, to green, to yellow, to red for high values (max=0,00051)8 7 8
ˆ differ sometimes significantly from zero. The estimated β’s To obtain the heat map in color, please contact the authors.
148
S. Kring et al.
VOW
TKA TUI
SAP SCH SIE
MAN
MEO MUV2 RWE
LIN LHA
FME HEN3 IFX
EOA
CON DCX DBK DB1 DPW DTE
BMW BAY CBK
ALT BAS
ADS ALV
ADS ALV ALT BAS BMW BAY CBK CON DCX DBK DB1 DPW DTE EOA FME HEN3 IFX LIN LHA MAN MEO MUV2 RWE SAP SCH SIE TKA TUI VOW
Fig. 19. Heat map of the sample covariance matrix. Dark blue colors corresponds to low values (min=0.000053), to blue, to green, to yellow, to red for high values (max=0,00097)9 −3
x 10
7
3
6
2.5
5
Variance
Dispersion
3.5
2
1.5
3
2
0.5
1
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
(a)
−3
4
1
0
x 10
0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
(b)
Fig. 20. Barplots (a) and (b) depict the eigenvalues of the sample dispersion matrix and the sample covariance matrix
Figure 20a,b illustrate the eigenvalues λi , i = 1, . . . , 29, of the sample dispersion matrix and covariance matrix, respectively. In both Figures, the first eigenvalue is significantly larger than the others. The amounts of the eigenvectors decline in similar fashion.
9
To obtain the heat map in color, please contact the authors.
Estimation of α-Stable Sub-Gaussian Distributions for Asset Returns
90%
80%
80%
Explained variance in percent
100%
90%
Explained dispersion in percent
100 %
70% 60% 50% 40% 30% 20%
70% 60% 50% 40% 30% 20% 10%
10% 0%
149
0% 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
(a)
1 2 3 4 5 6 7 8 9 10 11 1213 14 15 16 17 1819 20 21 22 23 2425 26 27 28 29
(b)
Fig. 21. Barplot (a) and (b) show the cumulative proportion of the total dispersion and variance explained by the components, i.e., ki=1 λi / 29 i=1 λi
Figure 21 (a) and (b) depict the cumulative proportion of the total variability explained by the first k principal components corresponding to the k largest eigenvalues. In both figures, more than 50% is explained by the first principal component. We observe that the first principal component in the stable case explains slightly more variability than in the ordinary case, e.g. 70% of the total amount of dispersion is captured by the first six stable components whereas in the normal case, only 65% is explained. In contrast to the normal PCA the stable components are not independent but quasi-uncorrelated. Furthermore, in the case of α = 1.69, the coefficient of tail dependence for two principal components, say Yi and Yj , is 0 √1/2 s1.69 √ ds 0 1−s2 ≈ 0.21 λ(Yi , Yj , 0, 1.69) = 0 1 1.69 s √ ds 0 1−s2 due to Theorem 6 for all i = j, i, j = 1, . . . , 29. In Fig. 22a–d we show the first four eigenvectors of the sample dispersion matrix, the so-called vectors of loadings. The first vector is positively weighted for all stocks and can be thought of as describing a kind of index portfolio. The weights of this vector do not sum to one but they can be scaled to be so. The second vector has positive weights for technology titles such as Deutsche Telekom, Infineon, SAP, Siemens and also to the non-technology companies Allianz, Commerzbank, and Tui. The second principal component can be regarded as a trading strategy of buying technology titles and selling the other DAX30 stocks except for Allianz, Commerzbank, and Tui. The first two principal components explain around 56% of the total variability. The vectors of loadings in (c) and (d) correspond to the third and fourth principal component, respectively. It is slightly difficult to interpret this with respect to any economic meaning, hence, we consider them as pure statistical quantities. In conclusion, the estimator σ ˆij (n), i = j, offers a simple way to estimate
150
S. Kring et al. ADS ALV ALT BAS BMW BAY CBK CON DCX DBK DB1 DPW DTE EOA FME HEN3 IFX LIN LHA MAN MEO MUV2 RWE SAP SCH SIE TKA TUI VOW
ADS ALV ALT BAS BMW BAY CBK CON DCX DBK DB1 DPW DTE EOA FME HEN3 IFX LIN LHA MAN MEO MUV2 RWE SAP SCH SIE TKA TUI VOW 0
0.05
0.1
0.15
0.2
0.3
0.25
−0.2
−0.1
0
0.1
(a)
0.2
0.3
0.4
0.5
0.6
0.7
0.8
(b)
ADS ALV ALT BAS BMW BAY CBK CON DCX DBK DB1 DPW DTE EOA FME HEN3 IFX LIN LHA MAN MEO MUV2 RWE SAP SCH SIE TKA TUI VOW
ADS ALV ALT BAS BMW BAY CBK CON DCX DBK DB1 DPW DTE EOA FME HEN3 IFX LIN LHA MAN MEO MUV2 RWE SAP SCH SIE TKA TUI VOW
−0.5
−0.4
−0.3
−0.2
(c)
−0.1
0
0.1
0.2
0.3
−0.4
−0.3
−0.2
−0.1
0
0.1
0.2
0.3
0.4
0.5
(d)
Fig. 22. Barplot summarizing the loadings vectors g1 , g2 , g3 and g4 defining the first four principal components: (a) factor 1 loadings; (b) factor 2 loadings; (c) factor 3 loadings; and (d) factor 4 loadings
the dispersion matrix in an i.i.d. α-stable sub-Gaussian model. The results delivered by the estimator are reasonable and consistent with economic theory. Finally, we stress that a stable PCA is feasible.
7 Conclusion In this paper we present different estimators which allow one to estimate the dispersion matrix of any normal variance mixture distribution. We analyze the estimators theoretically and show their consistency. We find empirically (1) that the estimator σ ˆij (n) has better statistical properties than the estima(2)
tor σ ˆij (n, p) for i = j. We fit an α-stable sub-Gaussian distribution to the DAX30 components for the first time. The sub-Gaussian model is certainly more realistic than a normal model, since it captures tail dependencies. But it has still the drawback that it cannot incorporate time dependencies.
Estimation of α-Stable Sub-Gaussian Distributions for Asset Returns
151
Acknowledgement The authors would like to thank Stoyan Stoyanov and Borjana Racheva-Iotova from FinAnalytica Inc for providing ML-estimators encoded in MATLAB. For further information, see [22].
References [1] Artzner, P., F. Delbaen, J.M. Eber and D. Heath. 1999. Coherent Measure of Risk. Mathematical Finance 9, 203–228. [2] Bingham, N.H., R. Kiesel and R. Schmidt. 2003. A Semi-parametric Approach to Risk Management. Quantitative Finance 3, 241–250. [3] Cheng, B.N., S.T. Rachev. 1995. Multivariate Stable Securities in Financial Markets. Mathematical Finance 54, 133–153. [4] Embrechts, P., A. McNeil and D. Straumann. 1999. Correlation: Pitfalls and Alternatives. Risk 5, 69–71. [5] Fama, E. 1965. The Behavior of Stock Market Prices. Journal of Business, 38, 34–105. [6] Fama, E. 1965. Portfolio Analysis in a Stable Paretian market. Management Science, 11, 404–419. [7] Hardin Jr., C.D. 1984. Skewed Stable Variables and Processes. Technical report 79, Center for Stochastics Processes at the University of North Carolina, Chapel Hill. [8] H¨ ochst¨otter, M., F.J. Fabozzi and S.T. Rachev. 2005. Distributional Analysis of the Stocks Comprising the DAX 30. Probability and Mathematical Statistics, 25, 363–383. [9] Mandelbrot, B.B. 1963. New Methods in Statistical Economics. Journal of Political Economy, 71, 421–440. [10] Mandelbrot, B.B. 1963. The Variation of Certain Speculative Prices. Journal of Business, 36, 394–419. [11] Mandelbrot, B.B. 1963. New Methods in Statistical Economics. Journal of Political Economy, 71, 421–440. [12] Markowitz, H.M. 1952. Portfolio Selection. Journal of Finance 7, (1), 77–91. [13] McCulloch, J.H. 1952. Financial Applications of Stable Distributions. Hanbbook of Statistics-Statistical Methods in Finance, 14, 393–425. Elsevier Science B.V, Amsterdam. [14] McNeil, A.J., R. Frey and P. Embrechts. 2005. Quantitative Risk Management, Princeton University Press, Princeton. [15] Nolan, J.P., A.K. Panorska, J.H. McCulloch. 2000. Estimation of Stable Spectral Measure. Mathematical and Computer Modeling 34, 1113–1122. [16] Nolan, J.P. 2005. Multivariate Stable Densities and Distribution Functions: General and Elliptical Case. Deutsche Bundesbank’s 2005 Annual Fall Conference.
152
S. Kring et al.
[17] Rachev, S.T., S. Mittnik. 2000. Stable Paretian Models in Finance. Wiley, New York. [18] Resnick, S.I. 1987. Extreme Values, Regular Variation, and Point Processes. Springer, Berlin. [19] Samorodnitsky, G., M. Taqqu. 1994. Stable Non-Gaussian Random Processes, Chapmann & Hall, New York [20] Schmidt, R. 2002. Tail Dependence for Elliptically Contoured Distributions. Mathematical Methods of Operations Research 55, 301–327. [21] Stoyanov, S.V. 2005. Optimal Portfolio Management in Highly Volatile Markets. Ph.D. thesis, University of Karlsruhe, Germany. [22] Stoyanov, S.V., B. Racheva-Iotova 2004. Univariate Stable Laws in the Fields of Finance-Approximations of Density and Distribution Functions. Journal of Concrete and Applicable Mathematics, 2/1, 38–57.
Risk Measures for Portfolio Vectors and Allocation of Risks Ludger R¨ uschendorf Department of Mathematical Statistics, University of Freiburg, Germany, [email protected]
1 Introduction In this paper we survey some recent developments on risk measures for portfolio vectors and on the allocation of risk problem. The main purpose to study risk measures for portfolio vectors X = (X1 , . . . , Xd ) is to measure not only the risk of the marginals separately but to measure the joint risk of X caused by the variation of the components and their possible dependence. Thus an important property of risk measures for portfolio vectors is consistency with respect to various classes of convex and dependence orderings. It turns out that axiomatically defined convex risk measures are consistent w.r.t. multivariate convex ordering. Two types of examples of risk measures for portfolio measures are introduced and their consistency properties are investigated w.r.t. various types of convex resp. dependence orderings. We introduce the general class of convex risk measures for portfolio vectors. These have a representation result based on penalized scenario measures. It turns out that maximal correlation risk measures play in the portfolio case the same role that average value at risk measures have in one dimensional case. The second part is concerned with applications of risk measures to the optimal risk allocation problem. The optimal risk allocation problem or, equivalently, the problem of risk sharing is the problem to allocate a risk in an optimal way to n traders endowed with risk measures 1 , . . . , n . This problem has a long history in mathematical economics and insurance. We show that the optimal risk allocation problem is well defined only under an equilibrium condition. This condition can be characterized by the existence of a common scenario measure. A meaningful modification of the optimal risk allocation problem can be given also for markets without assuming the equilibrium condition. Optimal solutions are characterized by a suitable dual formulation. The basic idea of this extension is to restrict the class of admissible allocations in a proper way. We also discuss briefly some variants of the risk allocation problem as the capital allocation problem.
154
L. R¨ uschendorf
2 Representation of Convex Risk Measures for Portfolio Vectors Convex risk measures for real risk variables have been axiomatically introduced and studied in the mathematical finance literature by Artzner et al. (1998), Delbaen (2002), F¨ ollmer and Schied (2004) and many others while there are independent and earlier studies of various aspects of risk measures and related premium principles in the economics and insurance literature. Various important subclasses of risk measures have been characterized. Law invariant, convex risk measures on L∞ (P ) (resp. Lr (P ), r ≥ 1) have been characterized by a Kusuoka type representation of the form + , (X) = sup AV @Rλ (X)µ(dλ) − β(µ) (1) µ∈M1 ([0,1])
(0,1)
where λ (X) = AV @Rλ (X) is the average value at risk 0 (also called expected shortfall or conditional value at risk), β(µ) = supX∈A (0,1] AV @Rλ (X)µ(dλ) is the penalty function, and A = {X ∈ L∞ (P ); (X) ≤ 0} is the acceptance set of (see Kusuoka (2001) and F¨ ollmer and Schied (2004)). Thus in dimension d = 1 the average value at risk measures λ are the basic building blocks of the class of law invariant convex risk measures. For some recent developments in the area of risk measures see [21]. For portfolio vectors X = (X1 , . . . , Xd ) ∈ L∞ d (P ) on (Ω, A, P ) a risk 1 (P ) → R is called convex risk measure if measure : L∞ d M1) X ≥ Y ⇒ (X) ≤ (Y ) M2) (X + mei ) = −m + (X), m ∈ R1 M3) (αX + (1 − α)Y ) ≤ α(X) + (1 − α)(Y ) for all α ∈ (0, 1); thus is a monotone translation invariant, convex risk functional (see [8, 17]). Like in d = 1 (X) denotes the smallest amount m to be added to the portfolio vector X such that X +me1 is acceptable. ei denotes here the i-th unit vector. d A subset A ⊂ L∞ d (P ) with R not contained in A is called (convex) acceptance set, if (A1) A is closed (and convex) (A2) Y ∈ A and Y ≤ X implies X ∈ A (A3) X + mei ∈ A ⇔ X + mej ∈ A. With A (X) := inf{m ∈ R; X + me1 ∈ A} risk measures are identified with their acceptance sets: (a) If A is a convex acceptance set, then A is a convex risk measure (b) If is a convex risk measure, then A = {X ∈ L∞ d ; (X) ≤ 0} is a convex acceptance set.
Risk Measures for Portfolio Vectors and Allocation of Risks
155
Let bad (P ) denote the set of finite additive, normed, positive measures on L∞ d (P ). Convex risk measures on portfolio vectors allow a representation similar to d = 1. 1 Theorem 1 (see [8]). : L∞ d (P ) → R is a convex risk measure if and only if there exists some function α : bad (P ) → (−∞, ∞] such that
(X) =
(EQ (−X) − α(Q)).
sup
(2)
Q∈bad (P )
α can be chosen as Legendre–Fenchel inverse α(Q) =
sup
X∈L∞ d (P )
(EQ (−X) − (X))
= sup EQ (−X). X∈A P
For risk measures which are Fatou-continuous, i.e. Xn → X, (Xn ) uniformly bounded implies (X) ≤ lim inf (Xn ), bad (P ) can be replaced by the class Md1 (P ) of P -continuous, σ-additive normed measures which can be identified by the class of P -densities D = {(Y1 , . . . , Yd ); Yi ≥ 0, EP Yi = 1, 1 ≤ i ≤ d}. For coherent risk measures, i.e. homogeneous, subadditive, monotone, translation invariant risk measures the representation in (2) simplifies to (X) = sup EQ (−X),
(3)
Q∈P
where P ⊂ ba(P ), resp. P ⊂ Md1 (P ), if the Fatou property holds, can be interpreted as class of scenario measures. d it implies (X) = (X) For law invariant convex risk measures, i.e. X = X has been found recently that maximal correlation risk measures play the role of basic building blocks as the average value at risk measures do in the Kusuoka representation result. Let for some density vector Y ∈ D, ΨY (X) = EX · Y denote the correlation of X and Y (up to normalization) and define d = X} ΨY (X) = sup{ΨY (Y ); X
(4)
the maximal correlation risk measure (in direction Y ). Theorem 2 (see [26]). Let Ψ be a Fatou continuous convex risk measure on L∞ d (P ) with penalty function α. Then it holds: Ψ is law invariant ⇔ Ψ has a representation of the form Ψ (X) = sup (ΨY (X) − α(Y )) Y ∈D0
with law invariant penalty function α and D0 = {Y ∈ D; α(Y ) < ∞}.
(5)
156
L. R¨ uschendorf
Remark 3 a) In particular, the law invariant coherent risk measures on L∞ d (P ) have a representation of the form Ψ (X) = sup ΨY (X)
(6)
Y ∈A
for some subset A ⊂ D. Thus the maximal correlation risk measures ΨY are the basic building blocks of all law invariant convex risk measures on portfolio vectors. b) For d = 1 the representation in (5) can be shown to be equivalent to the Kusuoka representation result in (1). For d ≥ 1 optimal couplings as in the definition of the maximal correlation risk measure ΨY arise, have been characterized in R¨ uschendorf and Rachev (1990). There are some examples where ΨY can be calculated in explicit form but in general one does not have explicit formulas. Therefore, it is useful to give more explicit constructions of risk measures for portfolio vectors which generalize the known classes of one dimensional risk measures. For some partial extensions of distortion type risk measures see [8, 26].
3 Consistency w.r.t. Convex Orderings and some Classes of Examples For some class of functions F ⊂ {f : Rd → R1 } the ordering ≤F is defined for random vectors X, Y by X ≤F Y
if Ef (X) ≤ Ef (Y ),
∀f ∈ F,
(7)
such that the integrals exist. In particular for the class of nondecreasing functions this leads to the stochastic ordering ≤st , for the class Fcx of convex functions this leads to the convex ordering ≤cx . Interesting dependence orderings are by the classes Fdcx of directionally convex functions, Fsm the class of supermodular functions, F∆ the class of ∆-monotone functions. The corresponding orderings are deuller and Stoyan (2002) for details on these noted by ≤dcx , ≤sm , ≤∆ (see M¨ orderings). From Strassen’s well-known representation result it follows that any risk measure on L∞ d (P ), which satisfies the monotonicity condition M1) is consistent w.r.t. stochastic ordering ≤st , i.e. X ≤st Y ⇒ (Y ) ≤ (X).
(8)
It is of particular interest to study consistency of risk measures w.r.t. the above mentioned various convexity and dependence orderings. Let ≤decx , ≤icx denote the ordering by decreasing resp. increasing convex functions. Then it turns out that all law invariant axiomatically defined convex risk measures are consistent w.r.t. decreasing convex ordering ≤decx (see [8]).
Risk Measures for Portfolio Vectors and Allocation of Risks
157
Theorem 4. Let be a law invariant, Fatou continuous convex risk measure on L∞ d (P ). Then is consistent w.r.t. ≤decx , i.e. X ≤decx Y ⇒ (X) ≤ (Y ).
(9)
Since X ≤decx Y is equivalent to Y ≤icv X, ≤icv the ordering by increasing concave functions, (9) is equivalent for d = 1 with consistency w.r.t. the second order stochastic dominance. The proof of Theorem 4 is based essentially on the following important property: For all X, Y ∈ L∞ d (P ) holds (X) ≥ (E(X | Y )),
(10)
i.e. smoothing by conditional expectation reduces the risk (for d = 1 see Schied (2004) or F¨ ollmer and Schied (2004)). In insurance mathematics the monotonicity axiom M1) of a risk measure has to be changed to monotonicity in the usual componentwise ordering. We shall use the notation Ψ (X) for risk measures satisfying this kind of monotonicity. The relation Ψ (X) = (−X) gives a one to one relation between risk measures in the financial context and risk measures Ψ in the insurance context. A natural idea to construct risk measures for portfolio vectors X is to measure the risk of some real aggregation of the risk vector like the joint portfolio or the maximal risk, i.e. to consider Ψ (X) = Ψ1
d +
, Xi
or
i=1
(11)
Ψ (X) = Ψ1 (max Xi ), i
where Ψ1 is a suitable one dimensional risk measure like expected shortfall or some distortion type risk measure. More generally for some class of real aggregation functions F0 = {fα ; α ∈ A} the following classes of risk measures have been introduced in Burgert and R¨ uschendorf (2006). Define ΨA (X) = sup Ψ1 (fα (X)), α∈A ΨM (X) = sup Ψ1 (fα (X))dµ(α),
(12) (13)
µ∈M
where M ⊂ Mσ (A) is a class of weighting measures on A. ΨA (X) is the maximal risk of some class of aggregation functions, while ΨM (X) considers the maximum risk over some weighted average. If for example A = ∆ = {α ∈ d Rd+ ; 0 i=1 αi = 1}, then one gets in this 0way risk measures like supα∈∆ Ψ1 (α · X), Ψ1 (α·X)dµ(α), Ψ1 (maxi αi Xi ) or ∆ Ψ1 (max αi Xi )dµ(α) measuring the risk in all positive directions α. It is important to assume that Ψ1 is consistent with respect to ≤icx – the increasing convex ordering. This is e.g. the case for distortion risk measures
158
L. R¨ uschendorf
0∞ Ψ1 (X) = 0 g(F X (t))dt where g is a concave distortion function and F X (t) = 1 − FX (t) is the survival function. Then the following consistency results hold true (see [8]): a) b)
If F0 ⊂ Ficx , then ΨA , ΨM are consistent w.r.t. ≤icx . (14) If F0 ⊂ Fism , (Fidcx ), then ΨA , ΨM are consistent w.r.t. ≤ism (≤idcx ). (15)
As consequence of a), b) one gets that more positive dependent risk vectors have higher risks. This extends some classical results on comparison of risk vectors. Let Fi−1 denote the generalized inverse of the distribution function Fi of Xi , then d
Xi ≤icx
i=1
d
Fi−1 (U ),
(16)
i=1
where U is uniformly distributed on [0, 1] (see Meilijson and Nadas (1979), R¨ uschendorf (1983)). Further with the comonotonic vector X c := (F1−1 (U ), . . . , Fd−1 (U )) holds the following basic comparison result wich extends (16) X ≤sm X c
and
X ≤∆ X c
(17)
(see Tchen (1980) and R¨ uschendorf (1980)). Thus as consequence of (15) and (17) we conclude under the conditions of (14), (15) ΨM (X) ≤ ΨM (X c ),
ΨA (X) ≤ ΨA (X c );
(18)
the comonotonic risk vector leads to the highest possible risk under all risk measures of type ΨM , ΨA . Extensions of (17) to compare risks also of two risk vectors X, Y are given in [11, 24]. For a review of this type of comparison results for risk vectors see the survey paper [25].
4 Risk Allocation and Equilibrium The classical risk sharing problem is to consider a market, described by some probability space (Ω, A, P ), and n traders in the market supplied with risk ∞ measures 1 , . . . , n . The problem nis to allocate a risk X ∈ L (P ) in an optimal way to the traders X = i=1 Xi , such that the risk vector (i (Xi )) is nPareto optimal in the class of all allocations or such that the total risk i=1 i (Xi ) is minimal under all allocations. This problem goes back to early work in the economics and insurance literature (see the early contributions of Borch (1960a,b, 1962), B¨ uhlmann
Risk Measures for Portfolio Vectors and Allocation of Risks
159
and Jewell (1979), Chevallier and M¨ uller (1994), and many others). It was later on extended to risk allocations in financial context (see e.g. Barrieu and El Karoui (2005) and references therein. An interesting point is that for translation invariant risk measures i , 1 ≤ i ≤ n, the principle of Pareto optimal risk allocations is equivalent to minimizing the total risk. This follows from the separating hyperplane theorem and some simple arguments involving translation invariance. In particular solutions are not unique and several additional (game theoretic) postulates like fairness have been introduced to single out specific solutions of the risk sharing problem. For example Chevallier and M¨ uller (1994) single out conditions which yield as possible solutions only portfolio insurance, tactical asset allocation, and collar strategies. Classical results are the derivation of linear quota sharing rules and of stop loss contracts as optimal sharing rules. We discuss in the following some developments on the risk allocation problem in the case where i are coherent risk measures with representation i (X) = supQ∈Pi EQ (−X) and scenario measures Pi . The more general case of convex risk measures is discussed in [7, 9]. There is a naturally associated equilibrium condition coming from similar equilibria conditions in game theory saying that in a balance of supply and demand it is not possible to lower some risks without increasing others. In formal terms this condition is formulated as: n Xi = 0 and i (Xi ) ≤ 0, ∀i, then i (Xi ) = 0, ∀i. (E) If Xi ∈ L∞ (P ) satisfy i=1
To investigate this equilibrium condition we introduce two naturally associated risk measures to the risk allocation problem. The first one is Ψ (X) = inf{m : X + m ∈ A},
(19)
with A the closed 9n cone generated by the union of the acceptance sets Ai of i , A = cone( i=1 Ai ). W.r.t. Ψ every risk is acceptable, which is acceptable to any one of the traders in the market. Thus Ψ corresponds to some kind of optimistic view towards risk. The second related risk measure is the infimal convolution ˆ = 1 ∧ · · · ∧n ˆ(X) = inf
n & i=1
i (Xi );
n
' Xi = X ,
(20)
i=1
which describes the optimal reachable total risk of an allocation. Both risk measures have been considered in the literature (see [12, 2]). It turns out (see [7]) that ˆ is a coherent risk measure ⇔ ˆ(0) = 0 (21) ⇔ The equilibrium condition (E) holds true ⇔ Ψ is a coherent risk measure
160
L. R¨ uschendorf
and in this case ˆ = Ψ and the scenario set P ∼ ˆ satisfies P = Pˆ = PΨ =
n :
Pi .
(22)
i=1
As consequence one obtains an interesting result of Heath and Ku (2004) (derived there for finite spaces Ω) saying: The equilibrium condition (E) is equivalent to n : Pi = ∅, (23) i=1
i.e. to the existence of a common scenario measure of all traders. In particular (21) implies that the optimal risk allocation problem makes sense only under the equilibrium condition (E). Without (E) it is not possible to determine Pareto optimal allocation rules or allocation rules which minimize the total risk and a natural question is what to do in case the equilibrium condition does not hold true. To consider a useful version of the optimal risk allocation problem we define for X ∈ L∞ (P ) n ' & Xi , (Xi ) admissible , A(X) = (Xi ); X =
(24)
i=1
where (Xi ) is called an admissible allocation of X if X(ω) ≥ 0 ⇒ Xi (ω) ≥ 0 X(ω) ≤ 0 ⇒ Xi (ω) ≤ 0.
(25)
The idea of introducing restrictions as above on the class of decompositions is similar to portfolio optimization theory, where restrictions on the trading strategies are introduced in order to prevent doubling strategies and thus to prevent the possibility of arbitrage. In the risk sharing problem we want to prevent risk arbitrage by restricting the class of admissible allocations. We define the admissible infimal convolution ∗ by ∗ (X) = inf
n &
' i (Xi ); (Xi ) ∈ A(X) .
(26)
i=1
Considering the connection with multiple decision problems and using a nonconvex version of the minimax theorem we get the following dual representation of ∗ , which essentially simplifies the calculation (see Burgert and R¨ uschendorf (2005)). Let X− , X+ denote the negative (positive) parts of ; < X and Pj , Pj denote the lattice supremum resp.; infimum
Risk Measures for Portfolio Vectors and Allocation of Risks
161
Theorem 5. For coherent risk measures i = Pi holds & ' = > a) ∗ (X) = sup X− d Pj − X+ d Pj ; Pj ∈ Pj , 1 ≤ j ≤ n ' & = > b) A∗ = X ∈ L∞ (P ); X+ d Pj ≤ X+ d Pj , ∀Pj ∈ Pj . The choice of restrictions in the definition of admissibility is justified by the following theorem which is based on Theorem 5. Theorem 6 (see [7]). Define the coherent admissible infimal convolution ˆ∗ (X) = inf{m ∈ R; X + m ∈ A∗ } = inf{m ∈ R; ∗ (X + m) ≤ 0}. a) Under the equilibrium condition (E) holds ˆ∗ = ˆ = Ψ . b) ˆ∗ is the largest coherent risk measure ≤ mini i . Part b) says that our chosen restrictions on decompositions are not too restrictive since as a result of them we get the largest possible coherent risk measure below i . Several related classes of restrictions can be given which lead to the same coherent risk measure. In particular we get a new useful coherent risk measure describing the value of the total risk of the optimal modified risk allocation problem. A different new type of restrictions on the allocation problem has been introduced in a recent paper by Filipovic and Kupper (2006) who consider n C for a given risk allocation X = i=1 i as admissible risk transfers only allocations of the form X=
n
Xi with Xi = Ci + xi · Z,
(27)
i=1
where Z = (Z1 , . . . , Zd ) is a finite vector of d fixed random instruments in the n market, xi ∈ Rd are admissible allocation vectors such that i=1 xi · Z ≤ 0. Thus the optimal restricted risk allocation problem n i=1
i (Ci + xi · Z) =
inf
xi admissible
(28)
leads to an optimization problem with vector valued variables x1 , . . . , xn ∈ Rd and methods from game theory can be applied to characterize optimal solutions. Problem (28) can be seen as a variant of the classical portfolio optimization problem, i.e. to minimize the risk (x·Z) over all portfolio vectors n x = (x1 , . . . , xd ), xi ≥ 0, i=1 xi = 1. There is an alternative related form of the risk allocation problem which may be called the capital allocation problem (see [12, Chapter 9]). For a firm with N trading units there are expected future wealth X1 , . . . , XN ∈
162
L. R¨ uschendorf
N L∞ (P ). If risk is measured by a risk measured , then k = ( i=1 Xi ) is the necessary capital the firm needs to cover the total risk. The problem is to find a fair allocation of the risk capital k = k1 + · · · + kN to the N trading units. Alternatively for subadditive risk measures nthis as the problem n one can see to distribute the gain of diversification i=1 (Xi ) − ( i=1 Xi ) ≥ 0 over the different business units of a financial institution. An allocation k1 , . . . , kN of the diversification gain is called fair if N
N , + ki = Xi
i=1
(29)
i=1
and for all J ⊂ {1, . . . , N } holds + , kj ≤ Xj . j∈J
(30)
j∈J
The existence of fair allocations (Bondarava–Shapley theorem for risk measures) is proved in Delbaen (2000) [12, Theorem 22] for coherent risk measures. Assuming continuity of from below (see [15, p. 167]) we get a simple proof of this existence result and more information on the fair allocation. Let P (see [15, p. 165]) denote the maximal representation set of scenario measures in the representation of . Theorem 7. Let be a coherent risk measure continuous from below and let N X1 , . . . , XN be N wealth variables with k = ( i=1 Xi ). Then there exists ∗ with ki∗ := EQ∗ (−Xi ) is some scenario measure Q∗ ∈ P such that k1∗ , . . . , kN a fair allocation of the risk capital k. Proof. By the representation of we have N N , + , + k= Xi = sup EQ − Xi . i=1
Q∈P
(31)
i=1
Using that is continuous from below Corollary 4.35 of F¨ ollmer and Schied (2004) implies the existence of some Q∗ ∈ P such that the supremum in (31) is attained in Q∗ and with ki∗ = EQ∗ (−Xi ) holds N N + , k = EQ∗ − Xi = ki∗ . i=1
i=1
Further for any J ⊂ {1, . . . , N } holds + , + Xj ) ≥ EQ∗ − Xj = kj∗ . j∈J
j∈J
j∈J
∗ Thus k1∗ , . . . , kN is a fair allocation of the risk capital.
(32)
Risk Measures for Portfolio Vectors and Allocation of Risks
163
References [1] P. Artzner, F. Delbaen, J.-M. Eber, and D. Heath. Coherent measures of risk. Finance and Stochastics, 9:203–228, 1998. [2] P. Barrieu and N. El Karoui. Inf-convolution of risk measures and optimal risk transfer. Finance and Stochastics, 9:269–298, 2005. [3] K. Borch. Reciprocal reinsurance treaties. ASTIN Bulletin, 1:170–191, 1960a. [4] K. Borch. The safety loading of reinsurance premiums. Skand. Aktuarietidskr., 1:163–184, 1960b. [5] K. Borch. Equilibrium in a reinsurance market. Econometrica, 30: 424–444, 1962. [6] H. B¨ uhlmann and W. S. Jewell. Optimal risk exchanges. ASTIN Bulletin, 10:243–263, 1979. [7] C. Burgert and L. R¨ uschendorf. Allocations of risks and equilibrium in markets with finitely many traders. Preprint, University Freiburg, 2005. [8] C. Burgert and L. R¨ uschendorf. Consistent risk measures for portfolio vectors. Insurance: Mathematics and Economics, 38:289–297, 2006. [9] C. Burgert and L. R¨ uschendorf. On the optimal risk allocation problem. Statistics & Decisions, 24(1), 2006, 153–172. [10] E. Chevallier and H. H. M¨ uller. Risk allocation in capital markets: Portfolio insurance tactical asset allocation and collar strategies. ASTIN Bulletin, 24:5–18, 1994. [11] C. Christofides and E. Vaggelatou. A connection between supermodular ordering and positive, negative association. Journal Multivariate Analysis, 88:138–151, 2004. [12] F. Delbaen. Coherent risk measures. Cattedra Galileiana. Scuola Normale Superiore, Classe di Scienze, Pisa, 2000. [13] F. Delbaen. Coherent risk measures on general probability spaces. In Klaus Sandmann et al., editors, Advances in Finance and Stochastics. Essays in Honour of Dieter Sondermann, pages 1–37. Springer, 2002. [14] D. Filipovic and M. Kupper. Optimal capital and risk transfers for group diversification. Preprint, 2006. [15] H. F¨ ollmer and A. Schied. Stochastic Finance. de Gruyter, 2nd edition, 2004. [16] D. Heath and H. Ku. Pareto equilibria with coherent measures of risk. Mathematical Finance, 14:163–172, 2004. [17] E. Jouini, M. Meddeb, and N. Touzi. Vector-valued coherent risk measures. Finance and Stochastics, 4:531–552, 2004. [18] S. Kusuoka. On law-invariant coherent risk measures. Advances in Mathematical Economics, 3:83–95, 2001. [19] I. Meilijson and A. Nadas. Convex majorization with an application to the length of critical paths. Journal of Applied Probability, 16:671–677, 1979.
164
L. R¨ uschendorf
[20] D. M¨ uller and D. Stoyan. Comparison Methods for Stochastic Models and Risks. Wiley, 2002. [21] Risk Measures and Their Applications. Special volume, L. R¨ uschendorf (ed.). Statistics & Decisions, vol. 24(1), 2006. [22] L. R¨ uschendorf. Inequalities for the expectation of ∆-monotone functions. Zeitschrift f¨ ur Wahrscheinlichkeitstheorie und verwandte Gebiete, 54:341–349, 1980. [23] L. R¨ uschendorf. Solution of statistical optimization problem by rearrangement methods. Metrika, 30:55–61, 1983. [24] L. R¨ uschendorf. Comparison of multivariate risks and positive dependence. J. Appl. Probab., 41:391–406, 2004. [25] L. R¨ uschendorf. Stochastic ordering of risks, influence of dependence and a.s. constructions. In N. Balakrishnan, I. G. Bairamov, and O. L. Gebizlioglu, editors, Advances in Models, Characterizations and Applications, volume 180 of Statistics: Textbooks and Monographs, pages 19–56. CRC Press, 2005. [26] L. R¨ uschendorf. Law invariant risk measures for portfolio vectors. Statistics & Decisions, 24(1), 2006, 97–108. [27] L. R¨ uschendorf and S. T. Rachev. A characterization of random variables with minimum L2 -distance. Journal of Multivariate Analysis, 1:48–54, 1990. [28] A. Schied. On the Neyman–Person problem for law invariant risk measures and robust utility functionals. Ann. Appl. Prob., 3:1398–1423, 2004. [29] A. H. Tchen. Inequalities for distributions with given marginals. Ann. Prob., 8:814–827, 1980.
The Road to Hedge Fund Replication: The Very First Steps Lars Jaeger Partners Group, Baar/Zug, Switzerland, [email protected]
1 Introduction The debate on sources of hedge fund returns is one of the subjects creating the most heated discussion within the hedge fund industry. The industry thereby appears to be split in two camps: Following results of substantial research, the proponents on the one side claim that the essential part of hedge fund returns come from the funds’ exposure to systematic risks, i.e. comes from their betas. Conversely, the “alpha protagonists” argue that hedge fund returns depend mostly on the specific skill of the hedge fund managers, a claim that they express in characterising the hedge fund industry as an “absolute return” or “alpha generation” industry. As usual, the truth is likely to fall within the two extremes. Based on an increasing amount of empirical evidence, we can identify hedge fund returns as a (time-varying) mixture of both, systematic risk exposures (beta) and skill based absolute returns (alpha). However, the fundamental question is: How much is beta, and how much is alpha? There is no consensus definition of ‘alpha’, and correspondingly there is no consensus model in the hedge fund industry for directly describing the alpha part of hedge fund returns. We define alpha as the part of the return that cannot be explained by the exposure to systematic risk factors in the global capital markets and is thus the return part that stems from the unique ability and skill set of the hedge fund manager. There is more agreement in modeling the beta returns, i.e. the systematic risk exposures of hedge funds, which will give us a starting point for decomposition of hedge fund returns into ‘alpha’ and ‘beta’ components. We begin with stating the obvious: It is generally not easy to isolate the alpha from the beta in any active investment strategy. But for hedge funds it is not just difficult to separate the two, it is already quite troublesome to distinguish them. We are simply not in a position to give the precise breakdown yet. In other words, the current excitement about hedge funds has not yet been subject to the necessary amount and depth of academic scrutiny. However, we argue that the better part of the confusion around hedge fund returns arises from the inability of conventional
166
L. Jaeger
risk measures and theories to properly measure the diverse risk factors of hedge funds. This is why only recently progress in academic research has started to provide us with a better idea about the different systematic risk exposures of hedge funds and thus give us more precise insights into their return sources.1 Academic research and investors alike begin to realize that that the “search of alpha” must begin with the “understanding of beta,” the latter constituting an important – if not the most important - source of hedge fund returns.2 However, at the same time we are starting to realize that hedge fund beta is different from traditional beta. While both are the result of exposures to systematic risks in the global capital markets hedge fund beta is more complex than traditional beta. Some investors can live with a rather simple but illustrative scheme suggested by C. Asness3 : If the specific return is available only to a handful investors and the scheme of extracting it cannot be simply specified by a systematic process, then it is most likely real alpha. If it can be specified in a systematic way, but it involves non-conventional techniques such as short selling, leverage and the use of derivatives (techniques which are often used to specifically characterize hedge funds), then it is possibly beta, however in an alternative form, which we will refer to as “alternative beta.” In the hedge fund industry “alternative beta” is often sold as alpha, but is not real alpha as defined here (and elsewhere). If finally extracting the returns does not require any of these special “hedge fund techniques” but rather “long only investing,” then it is “traditional beta.” But how do we model hedge fund returns explicitly and break them down into alpha, alternative beta and traditional beta? Ultimately, what we are looking for is a is a general equilibrium model, which relates hedge fund returns to their systematic risk exposures represented by directly observable market prices in the financial markets, similar to the Capital Asset Pricing Model for the equity markets.4 This model does not exist yet in its entirety, but there exists today a growing amount of academic literature on systematic risk factors and hedge funds’ exposure to them (i.e. their factor loadings), including a variety of “alternative beta factors.” We acknowledge that the quality of the offered model differs strongly for the different hedge fund strategy 1 2
3 4
See the recently published book by Jaeger (2005) and references therein. Martin (2004) makes the pertinent point that measures of alpha inextricably depend on the definition of benchmarks or beta components, going on to identify ways in which techniques for measuring ‘alpha’ in a traditional asset management environment are inappropriate or otherwise undermined by the specific characteristics of hedge fund exposures. Moreover, most techniques for measuring hedge fund alpha tend to reward fund managers for model and benchmark misspecification, as imperfect specification of benchmark or ‘beta’ exposure tends to inflate alpha. Asness (2004). While the CAPM is considered “dead” by most academics, there are extension of it in various forms that continue to be subject of research. Further the CAPM is still in extensive use by practioners.
The Road to Hedge Fund Replication
167
sectors. In other words, there is a variable degree of explanatory power for (the variation of) hedge fund returns that factor models can offer across different strategy sectors. While Long/Short Equity has been well modeled in academic research,5 models for some other strategies like Arbitrage strategies (Equity Market Neutral, Convertible Arbitrage) display rather limited explanatory power (i.e. low R-squared values). This article aims to give reference to this academic effort and provide a coherent discussion on the current status of “beta vs. alpha” controversy in the hedge fund industry. Literature references are given extensively. However, it goes further than what has been discussed in most academic papers in that it describes some of the implications we can draw from recognizing that there is likely more beta than alpha in hedge funds. We will discuss the possibility and reality of constructing passive, investable hedge fund indices thereof, and finally provide some remarks on the controversy of the future investment capacity for hedge funds. The article is structured as follows: The first part gives a review of the structure of the currently available return factor models for hedge funds. The second part discusses the problems and pitfalls of hedge fund indices, before the third and fourth part provides some concrete asset based factor models for the various hedge fund strategy sectors. The fifth part discusses how one can construct real benchmarks and possibly passive and investable hedge fund indices. The subsequent two sections discuss the future of hedge funds alphas and the entire industry’s investment capacity, before we provide some concluding remarks.
2 Factor Models for Hedge Fund Strategies: Revisiting Sharpe’s Approach In 1992 W. Sharpe introduced a unifying framework for such style models in an effort to describe active management strategies in equity mutual funds.6 In his model, he describes a certain active investment style as a linear combination of a set of asset class indices. In other words, an active investment strategy is a linear combination of passive, i.e. long-only, buy-and-hold, strategies. The models Sharpe introduced are successful in explaining the lion’s share of the performance of mutual funds. 5
6
W. Fung, D. Hsieh, “Extracting Portable Alpha from Equity Long/Short Hedge Funds” (2004), See “Asset Allocation: Management Style and Performance Measurement” (1992) by William Sharpe and the articles by Eugene Fama and Kenneth French “Multifactor explanations of Asset Pricing Anomalies” (1993) and “Common risk factors in the return of stocks and bonds” (1993). More information can also be found at the websites of William Sharpe, www.wsharpe.com, and Ken French, http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/.
168
L. Jaeger
Fung and Hsieh were the first to extend Sharpe’s model to hedge funds in 1997.7 They employed techniques similar to those Sharpe had applied to mutual funds five years earlier, but introduced short selling, leverage and derivatives – three important techniques employed by hedge funds - into their model. The resulting factor equation would account for all hedge fund return variation that derives from risk exposure to the risk factors of various asset classes. Adding alpha to the equation, it allows us to decompose hedge fund return as: Hedge fund excess return = Manager’s alpha + Σ (βi * Factori ) + random fluctuations Fung and Hsieh performed multifactor regressions of hedge fund returns on eight asset class indices: US equities, non-US equities, emerging market equities, US government bonds, non-US government bonds, one-month Eurodollar deposit rate, gold, and the trade-weighted value of the US dollar. They identified five risk factors (referred to as style factors), which they defined as modelling Global Macro, Systematic Trend-Following, Systematic Opportunistic, Value, Distressed Securities. They further argued that hedge fund strategies are highly dynamic and create option-like, non-linear, contingent return profiles. These non-linear profiles, they argued, cannot be modelled in simple asset class factor models. In their later research they explicitly incorporate assets with contingent payout profiles, e.g. options.8 Most of the studies which have followed show results consistent with Fung and Hsieh.9 The recent literature offers an increasing number of studies around the question of common style factor exposure and contingency in payoff profile for hedge funds.10 As the formula above describes, we infer the hedge funds’ alphas by measuring and subtracting out the betas times the beta factors. We can look at alpha as the “dark matter” of the hedge fund universe. It can only be measured by separating everything else out and seeing what is left. In other words, alpha is never directly observable, but is measured jointly with beta. It can only be indirectly quantified by separating the beta components out. The obtained value of alpha therefore depends on the chosen risk factors. If we leave out a relevant factor in the model, the alpha will come out as fictively high. To draw another analogy, we can equally say that alpha is the garbage bag of the regression: We account for everything we can, and whatever is left gets 7 8
9
10
Fung, W., Hsieh, D., (1997). The idea of option factors for the purpose of hedge fund modeling was already introduced in the earliest work on hedge fund models by W. Fung and D. Hsieh, (1997), and was since then discussed by many academic studies. See their recent work: W. Fung, D. Hsieh. (2002). See e.g. the article by S. Brown and W. Goetzmann, (2003). The authors identify eight style factors, i.e. three more than Fung and Hsieh in their research. See W. Fung, D. Hsieh, (2003); (2001); (2001); (2002); V. Agarwal, N. Naik, (2000); D. Capocci, G. H¨ ubner, (2004).
The Road to Hedge Fund Replication
169
put into alpha. As a consequence, some of the returns not accounted for by these models are unaccounted beta rather than alpha. Surely, an incomplete model of systematic risk factors doesn’t mean those additional risk factors do not exist; only that we do not yet know how to model them. To draw another image from astronomy, the outer planets of our solar system existed and exerted their gravitational pull long before we had telescopes sensitive enough to see them. Therefore the formula above on hedge fund returns should actually read as follows: Hedge fund return=Manager’s alpha +Σ (βi * Factori(modelled) ) + Σ (βi * Factori(unmodelled ) + random fluctuations. A simple example illustrates the problem: Consider a put writing strategy on the S&P 500, or equivalently a covered call writing strategy, as e.g. represented by the Chicago Board of Trade’s BXM index. To be precise, we write monthly at-the-money call options on existing equity positions with one month maturities. On regressing the BXM index against the S&P 500 over a period of 11 years from 1994 to 2004 we obtain a statistically significant alpha (i.e. a y-intercept of the regression) of around 0.4% per month, or almost 5% p.a. There is surely not much true skill driven alpha in writing put options on equities.11 All or most of the 0.4% is what we refer to as spurious or “phantom” alpha, which results from the imperfect specification of the chosen model (regression against the S&P 500). So we should not confuse pure manager skill with an imperfect model. This is a common problem of multi-factor models in the literature which claim to proof high alphas. We must therefore always take any statistics of alpha with a grain of salt.
3 The Problem with Hedge Fund Indices There is some more bad news for alpha: Hedge fund databases and thus the indices constructed thereof are subject to various biases which make their returns and thus the obtained alpha in a regression analysis based on these indices look bigger than they really are.12 The lack of transparency and uniform reporting standards in the hedge fund industry are disreputable sources of measurement errors that plague any hedge fund performance analysis. The most important of these are the survivorship and the backfilling bias. The consensus view of studies on this subject is that these effects account for at least 3-4% of the reported hedge fund out-performance. A recent study by B. Malkiel and A. Saha gives an idea about the performance upward biases in hedge fund indices.13 11
12
13
Writing put options and investing the collateral in cash is identical to writing covered calls, a property that is known as “put call parity” in option theory. See the discussion in chap. 7 and Chap. 9 in L. Jaeger “Through the Alpha Smoke Screens: A Guide to Hedge Fund Return Sources” (2005). B. Malkiel, A. Saha, “Hedge Funds: Risk and Return,” Working Paper (2004).
170
L. Jaeger
There is in fact little widely published data on historical hedge fund performance, so industry analysis relies mostly on aggregated returns as provided by a dozen of different index providers which differentiate hedge fund performance across the various strategy sectors. Although these indices constitute an important tool for comparison and possibly benchmarking within and outside the hedge fund industry, measuring manager performance, classifying investment styles, and generally creating a higher degree of transparency in this still rather opaque hedge industry, the results of these efforts vary significantly between providers and depend more on “committee decisions” regarding index construction criteria - such as asset weighting, fund selection and chosen statistical adjustments - than on objectively determined rules. Although this is also somewhat of a problem in traditional asset class indices, it is severely exacerbated in the hedge fund space by the diverse, dynamic and opaque nature of the hedge fund universe. However, the built-in flaws of existing indices have as much to do with the built-in complexities of hedge funds as with any fault of the index developers. It is simply more difficult to create unambiguous index construction guidelines for the heterogeneous hedge fund universe. In particular, while the construction of traditional asset class indices rests on the reasonably well founded assumptions that the underlying assets are homogenous, and that the investor follows a “buy and hold” strategy, hedge funds are diverse and subject to dynamic change. In traditional asset classes, the average return of the underlying securities in an index has a strong theoretical basis. It is constructed to be the return of the “market portfolio,” which is the assetweighted combination of all investable assets in that class or a representative proxy thereof. According to asset pricing theory – e.g. Sharpe’s Capital Asset Pricing Models (CAPM) - this market portfolio represents exactly the combination of assets with the optimal risk-return trade-off in market equilibrium. It is therefore not surprising that traditional equity indices became vehicles for passive investment only after the development of a clear theoretical foundation in the form of the CAPM.14 Traditional indices are designed to capture directly a clearly defined risk premium available to investors willing to expose themselves to the systematic risk of the asset class. So an investor in the S&P 14
It is worth noting here that equity indices remained almost solely performance analysis tools rather than investment vehicles for many years. The first asset weighted index tracker fund (on the S&P index) started in 1973, only about five years after the CAPM became broadly accepted. The very first tracker fund was launched in 1971 and was equally weighted (on the NYSE). The problem with equally weighted indices is that they require constant rebalancing to maintain those weightings, and in the pre-1975 period (i.e. prior to deregulation of stock commissions) such rebalancing was extremely costly. Wells Fargo launched a capweighted tracker fund in 1973 which enabled them to reduce transactions costs. Some argue that the predominance of the S&P500 as a benchmark owes more to the ease of replication than an inherent confidence in the theoretical jusutification for cap-weighting, see Schoenfeld’s book “Active IUndex Investing” (2004)
The Road to Hedge Fund Replication
171
500 index knows exactly what he is getting; broad exposure to the risks and risk premia of the US large cap equities market. In other words, there exists a general equilibrium model in the case of the stock markets. However, such a model is still missing for the asset class hedge funds. The standard way to construct a hedge fund index has so far been to use the average performance of a set of managers15 . However, indices constructed from averaging single hedge funds inherit the errors and problems of the underlying databases. Therefore they face several performance biases that limit the usefulness of the result.16 These biases include (but are not limited to): Survivorship. The survivorship bias is a result of unsuccessful managers leaving the industry, thus removing unsuccessful funds ex post from the representative index. Only their successful counterparts remain; creating a positive bias. In the most extreme case this is like lining up a number of monkeys, let them trade in the markets, take out all those that lost money, and then checking the performance of the rest. The survivors may all be in good shape, but they hardly represent the performance of the entire original group! Many hedge fund databases only provide information on currently operating funds, i.e. funds that have ceased operation are considered uninteresting for the investor and are purged from the database. This leads to an upwards bias in the index performance, since the performance of the disappearing funds is most likely worse than the performance of the surviving funds.17 Consensus estimates about the size of the survivorship bias in hedge fund databases vary from 2% to 4%. We note that hedge fund indices are only subject to this bias to the extent that they are constructed after the fact/inception of the index. Today index providers do not restate index returns on a going forward basis as managers drop in and out of their database. Index users should only use ‘live’ index data rather than all historical pro forma data. Backfilling. A variation of the survivorship bias can occur when a new fund is included into the index and his past performance is added or “backfilled” into the database. This induces another upward bias: New managers enter the database only after a period of good performance, when entry seems most 15
16
17
Indices based on average performance of a set of managers have generally well known pitfalls, already in traditional asset classes. See the article by Jeffrey Bailey “Are Manager Universes Acceptable Performance Benchmarks,” Spring 1992. Most of these issues are well known by practitioners and are discussed in details in Chap. 9 of L. Jaeger “Through the Alpha Smoke Screens.” A good overview of the problems can be found in A. Kohler, “Hedge Fund Indexing: A square Peg in a round hole,” State Street Global Advisors (2003). See also “Hedge Fund Indices” by G. Crowder and L. Hennessee, Journal of Alternative Investments, (2001); “A Review of Alternative Hedge fund Indices.” by Schneeweis Partners (2001); “Welcome to the Dark Side: Hedge Fund Attrition and Survivorship Bias over the Period 1994-2001” by G. Amin et al. (2001). The survivorship bias is also well known in the world of mutual funds, see for example the paper by S. Brown et al., “Survivorship Bias in Performance Studies” (1992).
172
L. Jaeger
attractive. Since fewer managers enter during periods of bad performance, bad performance is rarely backfilled into the averages.18 Again, hedge fund indices are only subject to this bias to the extent that they are constructed after the fact/inception of the index. Selection.Unlike public information used to compose equity and bond indices, hedge fund index providers often rely on hedge fund managers to voluntarily and correctly submit return data on their funds. Hedge fund managers are private investment vehicles and are thus not required to make public disclosure of their activities. Some bluntly refuse to submit data to any index providers. This “self-selection bias” causes significant distortions in the construction of the index and often skews the index towards a certain set of managers and strategies on a going forward basis. Sampling differences produce much of the performance deviation between the different fund indices. Hedge fund indices draw their data from different provider, the largest of which are the TASS, Hedge Fund Research (HFR) and CISDM (formerly MAR) database. These databases have surprisingly few funds in common, as most hedge funds report their data – if at all – only to a subset of the databases. Counting studies have shown that less than one out of three hedge funds in any one database contributes to the reported returns of all major hedge fund indices19 . Autocorrelation. Time lags in the valuation of securities (especially for less liquid strategies like Distressed Securities) held by hedge funds may induce a smoothening of monthly returns which leads to volatility and correlations being significantly underestimated. Statistically this effect expresses itself by significant autocorrelation in hedge fund returns (as will be shown below). Ironically, the theoretical and practical problems described above do not disappear when the index is designed to be investable. Some problems are actually exacerbated. A prerequisite for creating an investment vehicle is that the underlying managers provide sufficient capacity for new investments. This creates a severe selection bias, as hedge funds at full capacity (closed) are a priori not considered in the index. In traditional assets, an investor in the Dow Jones Industrial Average Index does not need to worry that IBM is closed for further investment.20 But for hedge fund indices, capacity with top 18
19
20
R. Ibbotson estimates this bias to account for a total of up to 4% of reported hedge fund performance (Presentation at GAIM conference 2004). See also: Brown, S, Goetzmann, W., Ibbotson, R., “Offshore hedge funds: Survival and performance 1989–1995” (1999). A recent estimate of the backfilling bias is given by B. Malkiel et. al in their paper “Hedge Funds: Risk and Return” (2004) where the backfilling bias is estimated in the same region as by Ibbotson. See the study by W. Fung and D. Hsieh, “Hedge Fund Benchmarks: A Risk Based Approach” (2004) To be more precise, IBM stocks are in fact “closed for further investments” as there are only a finite number of shares available (assuming no capital increase). In this way they actually resemble closed hedge funds. However, any investor who desires can freely purchase IBM shares in the secondary markets (stock markets) due to its high degree of liquidity (that is what stock markets are all about). In this sense the comparison serves us well here.
Ap M r-03 a Ju y-0 n 3 Ju -03 Aul-03 g Se -03 p O -03 c N t-0 ov 3 D -0 e 3 Ja c-0 n 3 Fe -04 M b-0 ar 4 Ap -04 M r-04 a Ju y-0 n 4 Ju -04 Aul-04 Se g-0 p 4 O -04 c N t-04 o D v-0 e 4 Ja c-0 n 4 Fe -05 M b-0 ar 5 Ap -05 M r-05 a Ju y-0 n 5 Ju -05 Aul-05 g05 M a Ap r-0 3 M r-0 a 3 Ju y-0 n- 3 Ju 03 Aul-03 g Se -0 p 3 O -03 c N t-0 ov 3 D -0 e 3 Ja c-0 n 3 Fe -0 4 M b-0 a 4 Ap r-0 4 r M 0 a 4 Ju y-0 n 4 Ju -04 Aul-04 g Se -04 p O -0 c 4 N t-0 o 4 D v-0 e 4 Ja c-0 n 4 Fe -0 5 M b-0 a 5 Ap r-0 5 M r-0 a 5 Ju y-0 n- 5 Ju 05 Aul-05 g05
Ap M r-03 a Ju y-0 n 3 Ju -03 Aul-03 Se g-0 p 3 O -0 c 3 N t-03 o D v-0 e 3 Ja c-0 n 3 Fe -04 M b-0 ar 4 Ap -0 4 M r-04 a Ju y-0 n- 4 Ju 04 Aul-04 Se g-0 p 4 O -04 c N t-04 o D v-0 e 4 Ja c-0 n 4 Fe -05 M b-0 ar 5 Ap -0 5 M r-05 a Ju y-0 n- 5 Ju 05 Aul-05 g05
M a Ap r-0 3 M r-0 a 3 Ju y-0 n 3 Ju -03 Aul-03 Se g-0 p 3 O -03 c N t-0 o 3 D v-0 e 3 Ja c-0 3 Fen-0 4 M b-0 a 4 Ap r-0 4 M r-0 a 4 Ju y-0 n 4 Ju -04 Aul-04 Se g-0 p 4 O -04 c N t-0 ov 4 D -0 e 4 Ja c-0 4 Fen-0 5 M b-0 ar 5 Ap -0 5 M r-0 a 5 Ju y-0 n- 5 Ju 05 Aul-05 g05
HFRI Equity Hedge
130
120
110
100
145
150 160
108 HFRI Event Driven
HFRI Macro
HFXI Convertible index HFRX Event Driven
140
135
125
115
110
100
115
100
110
HFRX Convertible Arb
106
104
100
98
M a Ap r-0 3 M r-0 a 3 Ju y-0 n- 3 Ju 0 3 Aul-03 Seg-0 p- 3 O 03 c N t-0 o 3 D v-0 e 3 Ja c-0 3 Fen-0 4 M b-0 a 4 Ap r-0 4 M r-0 a 4 Ju y-0 n 4 Ju -04 Aul-04 Seg-0 4 O p-0 c 4 N t-0 o 4 D v-0 e 4 Ja c-0 4 Fen-0 5 M b-0 ar 5 Ap -0 5 M r-0 a 5 Ju y-0 n- 5 Ju 05 Aul-05 g05
M a Ap r-0 3 M r-0 a 3 Ju y-0 n- 3 Ju 0 3 Aul-03 Seg-0 p- 3 O 03 c N t-0 o 3 D v-0 e 3 Ja c-0 3 Fen-0 4 M b-0 a 4 Ap r-0 4 M r-0 a 4 Ju y-0 n 4 Ju -04 Aul-04 Seg-0 4 O p-0 c 4 N t-0 ov 4 D -0 e 4 Ja c-0 4 Fen-0 5 M b-0 ar 5 Ap -0 5 M r-0 a 5 Ju y-0 n- 5 Ju 05 Aul-05 g05 140
M a Ap r-0 3 M r-0 a 3 Ju y-0 n- 3 Ju 03 Aul-03 Seg-0 3 O p-0 c 3 N t-0 o 3 D v-0 e 3 Ja c-0 3 Fen-0 4 M b-0 a 4 Ap r-0 4 M r-0 a 4 Ju y-0 n 4 Ju -04 Aul-04 Seg-0 4 O p-0 c 4 N t-0 o 4 D v-0 e 4 Ja c-0 4 Fen-0 5 M b-0 ar 5 Ap -0 5 M r-0 a 5 Ju y-0 n- 5 Ju 05 Aul-05 g05
M a Ap r-0 3 M r-0 a 3 Ju y-0 n- 3 Ju 03 Aul-03 Seg-0 3 O p-0 c 3 N t-0 o 3 D v-0 e 3 Ja c-0 3 Fen-0 4 M b-0 a 4 Ap r-0 4 M r-0 a 4 Ju y-0 n 4 Ju -04 Aul-04 Seg-0 4 O p-0 c 4 N t-0 ov 4 D 0 e 4 Ja c-0 4 Fen-0 5 M b-0 ar 5 Ap -0 5 M r-0 a 5 Ju y-0 n- 5 Ju 05 Aul-05 g05
The Road to Hedge Fund Replication
135 HFRX Equity Hedge HFRI Equity Market Neutral
HFRI Distressed
HFRX Macro HFRI Merger Arb
140
96
110
94
105
92
100
HFRI Fund Weighted Index
173
managers is a main issue. There is a clear trade-off between making an index representative and making it investable. Fig. 1 shows the divergence of various Hedge Fund Research investable indices vs. their non-investable counterparts since inception of the former. The deviation is eye-catching: Let us just have a specific look at the Equity Hedge indices. The average monthly underperformance of the HFRX, the investable counterpart of the HFRI index, to the 115
110 HFRX Equity Market Neutral
125
115 105
105 100
95
150 HFRX Distressed Securities
130 140
120 130
120
105 110
100
130
125 120
115 HFRX Merger Arb
120
110
110
105 105
100
135
HFRX Global Index
130
102
125
120
115
Fig. 1. Comparison of cumulative performance for the HFR investable indices vs. their non-investable counterparts since inception of the former. The last graph shows the index referring to global hedge fund industry
174
L. Jaeger
HFRI index is 62 bps, which translates into an average annual underperformance of 7.7%! We conjecture that this is about selection bias in the investable versions of the index more than survivorship bias in the non-investable one. Investable indices depend directly on the services of particular “access providers.” The selection of the index participants is biased towards the access these service providers have to various hedge funds. This “access bias” can lead to a severe distortion in the index. The investment capacity of hedge fund managers (at least those which are actually in a position to provide persistent alpha) is a scarce resource, for which investable index providers must compete with other investors, e.g. funds of funds. An investor in a traditional S&P 500 index fund does not have to worry that stocks in IBM will not be available for purchase. But for an investable hedge fund index, availability of specific funds is indeed an issue (as for any other investor). In such non-public markets as those in which hedge funds do their offering, access is not determined by market price, but by the investors’ ability to get and keep direct access to the individual fund manager. Often this is determined by personal relationships and other “soft factors.” Therefore the distinction between indices and regular fund of funds disappears upon a closer look for most index providers.21 The indices struggle for capacity, must perform due diligence on hedge fund managers, and have similar subjective means to select and assign weights to hedge funds. It is thus not surprising that they often charge similar levels of fees as funds of funds and in almost all cases actually also operate as such. We can essentially identify them as disguise fund of funds that have discovered the marketing value of the “index” label.22 They currently offer neither low fee structures nor the clearly defined risk profiles comparable to a passive index fund in traditional asset classes.23 The true test of whether a hedge fund index is a valid investment vehicle is whether there is a secondary market for hedge funds, whether one can 21
22
23
The distinction between investible index providers and fund of funds is/should be about systematic methodology and goals for manager selection. Most index providers have virtually no selection methodology, and to that extent they are just fund of funds. Those that do have well founded methodologies that are implemented can, without demurring, be called indices. The biggest problem really is that the index provider and the asset manager are in fact identical—this is unlike the case for US Equity Indices, but not unlike the case for the most well regarded bond indices (e.g. Lehman). One important difference between the index provider and a fund of hedge funds remains, though: The fund of funds manager is actively searching for alpha and trading talent, which justifies the comparably high feel level charged. He is not in the business of “averaging the alpha,” an undertaking which almost by construction will lead to lower results in the case of hedge funds. Note that alpha extraction is on a global scale a “zero sum game.” The reader is referred to the following article for another discussion on the problems and pitfalls of hedge fund indices: L. Jaeger, “Hedge Fund Indices – A new way to invest in absolute returns strategies?,” (June 2004).
construct derivatives from it and whether it can be sold short. The possibility of short selling and constructing synthetic positions based on derivatives (in a cost efficient way) creates the prospect of arbitrage opportunities using the hedge fund indices. Ironically such arbitrage opportunities would most likely be exercised by hedge funds, in a sort of Klein bottle of investments that contain themselves. Whether or not such trades emerge will eventually prove whether hedge fund indices can sustain market forces, which ultimately enforce an arbitrage-free market equilibrium. Today, there is an active market for structured products referencing hedge fund indices, including delta one products that allow investors to synthetically short some of the investable hedge fund indices.
4 Modelling Hedge Fund Returns: A First Simple Example
Figure 2 provides a first insight into how a combination of simple systematic strategies, each of which mirrors particular "beta factors" (risk premia), tracks the performance of a multi-strategy hedge fund portfolio. It displays the return of an equally weighted combination of three simple strategies, each tracking different risk premia:
Fig. 2. Performance of an equally weighted combination of three strategies: the sGFII trend following index, the BXM covered call writing index, and a long position in the Credit Suisse High Yield Bond Index (annualised return: 10.3%, annualised volatility: 5.6%). For comparison, we show the performance of the HFR Composite (annualised return: 11.7%, annualised volatility: 7.2%), the HFR Fund of Funds Index (annualised return: 7.9%, annualised volatility: 5.8%) and the S&P 500
1. A simple trend following model on 25 liquid futures markets, summarized in what is known as the "sGFI index" (Bloomberg ticker "SGFII").24
2. A covered call writing ("buy write") strategy on the S&P 500, represented by the BXM covered call writing index.25
3. A long position in the Credit Suisse High Yield Bond Index.
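As an illustration only, the following minimal sketch shows how such an equally weighted combination and the summary statistics quoted in the Fig. 2 caption (annualised return and volatility) can be computed from monthly return series. This is not the authors' code, and the input series below are random placeholders standing in for the actual sGFII, BXM and high yield index data.

import numpy as np

def combine_equal_weight(*monthly_returns):
    """Average several monthly return series with equal weights."""
    stacked = np.vstack(monthly_returns)      # shape: (n_strategies, n_months)
    return stacked.mean(axis=0)               # equally weighted portfolio return

def annualized_stats(monthly_returns):
    """Annualized (geometric) return and annualized volatility from monthly data."""
    growth = np.prod(1.0 + monthly_returns)
    years = len(monthly_returns) / 12.0
    ann_return = growth ** (1.0 / years) - 1.0
    ann_vol = np.std(monthly_returns, ddof=1) * np.sqrt(12.0)
    return ann_return, ann_vol

# Hypothetical placeholder series (in practice: sGFII, BXM and high yield index returns)
rng = np.random.default_rng(0)
trend, buywrite, high_yield = (rng.normal(0.008, 0.02, 120) for _ in range(3))
portfolio = combine_equal_weight(trend, buywrite, high_yield)
print(annualized_stats(portfolio))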
5 Regression of Hedge Fund Returns on Systematic Risk Factors
In the following we perform some modelling of hedge fund strategies based on various regressions on systematic risk factors. For lack of better data we must rely on the publicly available hedge fund indices despite the shortcomings mentioned above. One might suggest that a better choice would be to perform the analysis on the investable indices, as these do not come with these upward biases. However, as discussed above, these often lack the necessary degree of representativeness due to their own selection biases. Furthermore, their history is too short to perform a meaningful regression.
24 See L. Jaeger et al., "Case study: The sGFI Futures Index", Journal of Alternative Investments (Summer 2002).
25 "Buy write" refers to holding the underlying long – in this case the S&P 500 index – and simultaneously selling a call. This combination is economically identical to selling a put on the S&P 500 plus holding an equivalent amount of cash.
And we claim that non-investable hedge fund indices themselves serve better as dependent variables in a risk factor analysis than it might seem at first sight. Their discussed shortcomings refer mostly to the absolute level of performance and not to their risk characteristics. While non-investable indices fail when used as absolute performance measures, they may very well do their service when it comes to describing the typical risk exposure characteristics of the diverse strategies.26 In other words, biases such as survivorship and backfilling bias affect mostly the y-intercept, i.e. the alpha, and less so the sensitivities, i.e. the betas, of the regression. In order to illustrate this statement, we performed an analysis identical to the one above on extended sets of individual managers as provided by the TASS database. The R-squares obtained this way can be expected to be much lower due to the heterogeneity of hedge fund managers even within the same sector,27 but the obtained average values for the sensitivities are generally quite similar. Figure 3 illustrates this for the case of Long/Short Equity managers, where we display the histograms of the factor sensitivities obtained in our regression analysis for 483 Long/Short Equity managers in the period from 1998 to 2004. These results should be compared to the results in the first row of Table 1 below.
Fig. 3. Histograms of the factor exposures ("betas") of Long/Short Equity managers for the independent variables as in Table 1 (panels: Convertible, Small-Cap Spread, CPPI, and AR(1) exposures). Data: TASS
26 Which is actually what linear regression models do: they explain variance, not absolute return.
27 This is already the case when performing a regression on individual stocks. The reason is simply idiosyncratic risk!
Table 1. Alternative factors
Table 1 summarizes the results of a multifactor regression on the various hedge fund strategy sector indices provided by the data provider Hedge Fund Research (HFR). Returns are calculated on monthly data as geometric averages (cumulative returns) of the log-differences of consecutive (monthly) prices. Further, the risk-free rate of return was explicitly subtracted from all independent as well as dependent variables, evidently with the exception of spread factors (as the risk-free rate we chose US 3-month Libor). Note that the regression models include the AR(1) factor (the autocorrelation factor, which
is the one-month lagged time series of the dependent variable) as an independent variable where significant. The reason for this is simply that in several hedge fund strategies reported prices do not adjust instantly to changing prices of the underlying instruments but only with a delay, either because the underlying markets the funds trade in are less liquid (lagged marking of assets) or, as has been hypothesized elsewhere, because hedge fund managers actively smooth their reported returns over time.28
Overall, the set of factors captures a large percentage of the hedge fund return characteristics, which expresses itself in high R2 values of 60% on average. But at the same time this means that although we can explain a substantial part of the variation of hedge fund returns by these factor models, a substantial part is still missing. Furthermore, the regressions are much more successful at explaining some hedge fund strategies than others. They do well at explaining Long/Short Equity, Short Selling, and Event Driven strategies. On the other hand, they do a poorer job with the strategies Equity Market Neutral, Merger Arbitrage, and Managed Futures.
We realize that hedge funds earn a substantial part of their returns by taking systematic risks that our statistical methods allow us to measure. But the nature of these risks often diverges from the standard notion of systematic (broad market) risk. In the case of equity risk factors, it is often small cap risk (Russell 2000), non-linear risk (convertible bonds, BXM), or default risk (high yield, emerging markets) rather than the risk of the overall stock market. In the case of bond market risks, it is specifically credit risk that is assumed by many hedge funds (Event Driven, Distressed Debt, Fixed Income Arbitrage, Convertible Arbitrage).
Note the significance of the autoregressive term AR(1) in the regression in five out of ten strategies. We can interpret the autocorrelation shown in the results as a sign of persistent price lags in the valuation of hedge funds. This implies that simple measures of risk like Sharpe ratio, volatility, correlation with market indices etc. significantly underestimate the true market risk in hedge fund strategies. Indeed, positive autocorrelation has two effects: it drives down estimated volatility, and it means that suddenly changing market conditions and shocks – as measured by the risk factors – distribute over several periods. The AR(1) factor thus measures some lagged beta. Excluding this factor would cause some unaccounted beta to be misinterpreted as alpha.
28 A thorough discussion of the autoregressive factor can be found in Getmansky, M., Lo, A. W., Makarov, I., "An Econometric Model of Serial Correlation and Illiquidity in Hedge Fund Returns" (2004). See also the paper by C. Asness et al., "Do Hedge Funds Hedge?" (2001).
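To make the regression set-up concrete, here is a minimal sketch, not the code used for Table 1, of a multifactor regression of monthly excess returns on risk factors plus the one-month lagged index return as the AR(1) factor. The function name and the use of plain OLS via numpy are our own assumptions.

import numpy as np

def factor_regression(fund_excess, factor_matrix):
    """OLS of fund excess returns on factors and the lagged fund return.

    fund_excess  : (T,) monthly returns in excess of the risk-free rate
    factor_matrix: (T, k) factor excess returns aligned with fund_excess
    Returns (alpha, betas, ar1_coefficient, r_squared).
    """
    y = fund_excess[1:]                        # drop the first month (no lag available)
    lagged = fund_excess[:-1]                  # AR(1) regressor
    X = np.column_stack([np.ones_like(y), factor_matrix[1:], lagged])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    r2 = 1.0 - resid.var() / y.var()
    return coef[0], coef[1:-1], coef[-1], r2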
The regression results discussed above merit a more detailed look at some of the statistics we obtained, specifically regarding the stability of our models, a subject which is surprisingly little covered in the literature. For this purpose we performed a CUSUM test, which is designed to test whether the obtained regression models are stable to any statistically significant degree.
The CUSUM test considers the cumulated sum of the (normalized) recursive residuals w_r,
W_t = \sum_{r=K+1}^{t} \frac{w_r}{\hat{\sigma}} ,
(where the denominator \hat{\sigma} is the predicted standard deviation of the error term of the regression). In order to perform the test, W_t is plotted as a function of the time variable t. The null hypothesis of model stability can be rejected when W_t breaks the straight lines passing through the points (K, ±a(T−K)^{1/2}) and (T, ±3a(T−K)^{1/2}), where a is a parameter depending on the chosen level of significance. Figure 4 displays the cumulated residuals for all models. We observe that for none of our models do the cumulated residuals W_t break the confidence levels. Therefore the null hypothesis of model stability cannot be rejected for any of our models. A second test for model stability is to plot the obtained factor sensitivities over time in a rolling regression. We performed this analysis as well, and the results likewise indicate a generally high degree of stability of these factors. Figure 5 shows the results for all our strategies.
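The following is a small sketch of the CUSUM computation just described, assuming the standard Brown-Durbin-Evans construction in which the critical lines run from (K, ±a√(T−K)) to (T, ±3a√(T−K)); the value a = 0.948 is the commonly quoted 5% significance parameter, and the helper function is ours, not the authors' test code.

import numpy as np

def cusum_test(y, X, a=0.948):
    """Recursive residuals, their cumulated sum W_t, and the critical bounds."""
    T, K = X.shape
    w = np.full(T, np.nan)
    for t in range(K, T):                      # the first K observations only initialize the fit
        beta, *_ = np.linalg.lstsq(X[:t], y[:t], rcond=None)
        h = X[t] @ np.linalg.pinv(X[:t].T @ X[:t]) @ X[t]
        w[t] = (y[t] - X[t] @ beta) / np.sqrt(1.0 + h)     # recursive residual
    sigma = np.nanstd(w, ddof=1)
    W = np.nancumsum(w / sigma)                # W_t = sum_{r=K+1}^t w_r / sigma_hat
    t_idx = np.arange(T)
    bound = a * np.sqrt(T - K) + 2.0 * a * (t_idx - K) / np.sqrt(T - K)
    stable = np.all(np.abs(W[K:]) < bound[K:]) # stability not rejected if W stays inside the lines
    return W, bound, stable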
Fig. 4. Results of a CUSUM stability test for the regression models in Table 1
6 Mimicking Hedge Fund Strategies: Can We Create Better Indices?
The obvious question arises: Can we use the insights given by the models and the factor exposures discussed above to create better benchmarks? These would aim at mimicking the particular hedge fund strategies, and possibly constitute investable alternatives to the currently offered hedge fund indices (a provocative thought which we already hinted at in Fig. 2). The very goal would be to accurately separate systematic risk exposure from true manager alpha. The former constitutes what an index is all about, while the latter by definition should not be part of an index/benchmark. The idea of using strategy replications to model hedge fund returns in a factor model setting was developed in a paper by Fung and Hsieh in 2001 for Managed Futures strategies.29 Fung and Hsieh modelled the performance of a generic trend-following strategy using look-back straddles. Since then they and others have applied this type of modelling to a variety of other hedge fund styles,30 including Merger Arbitrage,31 Fixed Income Arbitrage,32 and Long/Short Equity.33 The hedge fund firm Bridgewater, for example, has conducted some simple but interesting research along these lines.34 In most of these studies the authors used simple trading strategies for modelling Managed Futures, Long/Short Equity, Merger Arbitrage, Fixed Income Arbitrage, Distressed Securities, Emerging Markets, and Short Selling strategies and generally reached good correspondence with the broadly used hedge fund sub-indices of the corresponding strategy sector.
In the following we calculate the performance of a strategy which invests directly in the factor exposures taken from the regression, i.e. we explicitly calculate the cumulative returns Return(t) = Σ_i β_i · Factor_i(t). The factors chosen for this analysis are the same as in the regression above. We refer to these returns as the "Replicating Factor Strategy" returns (in the following simply "RFS" returns) and compare them to the realized returns displayed by the corresponding hedge fund indices.
29 See W. Fung, D. Hsieh, "The Risk in Hedge Fund Strategies: Theory and Evidence from Trend-Followers" (2001).
30 See W. Fung, D. Hsieh, "The Risk in Hedge Fund Strategies: Alternative Alphas and Alternative Betas" in L. Jaeger (ed.), "The New Generation of Risk Management for Hedge Funds and Private Equity Investment" (2003).
31 M. Mitchell, T. Pulvino, "Characteristics of Risk in Risk Arbitrage" (2001).
32 W. Fung, D. Hsieh, "The Risk in Fixed Income Hedge Fund Styles" (2002).
33 W. Fung, D. Hsieh, "The Risk in Long/Short Equity Hedge Funds" (2004); V. Agarwal, N. Naik, "Performance Evaluation of Hedge Funds with Option-Based and Buy-and-Hold Strategies" (2003).
34 See the publication by G. Jensen and J. Rotenberg, "Hedge Funds Selling Beta as Alpha" (2003).
In order to avoid the problem of data mining and in-sample over-fitting, the factors chosen for the RFS were calculated on a rolling, forward-looking basis. To be precise, the RFS returns in a given month were calculated using factors obtained by a regression over data for the previous five years ending with the previous month. The RFS are similar in spirit to what Jensen et al.35 describe as a generic replication of a hedge fund strategy, with the difference, however, that the chosen factors/sub-strategies are explicitly modelled in the regression set-up. The results for the most recent three years (since inception of the investable indices) are rather astonishing: the cumulative returns of the replicating strategies are often superior to the returns of the hedge fund indices, especially when considering their investable versions. For the latter, the performance of the RFS is better for every single strategy sector with the exception of the Distressed strategy.
35 G. Jensen and J. Rotenberg, "Hedge Funds Selling Beta as Alpha" (2003), updated in 2004 and 2005.
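As an illustration of this rolling, forward-looking construction (not the authors' implementation), the sketch below estimates betas on the trailing 60 months and applies them, without the intercept, to the current month's factor returns, i.e. Return(t) = Σ_i β_i · Factor_i(t). Function name and window parameter follow the description in the text.

import numpy as np

def rfs_returns(index_returns, factors, window=60):
    """Out-of-sample replicating factor strategy (RFS) returns.

    index_returns: (T,) hedge fund index returns
    factors      : (T, k) factor returns aligned with index_returns
    """
    T = len(index_returns)
    rfs = np.full(T, np.nan)
    for t in range(window, T):
        y = index_returns[t - window:t]        # previous five years, ending with last month
        X = np.column_stack([np.ones(window), factors[t - window:t]])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        betas = coef[1:]                       # drop the intercept: alpha is not replicated
        rfs[t] = factors[t] @ betas            # apply the betas to this month's factor returns
    return rfs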
Fig. 6. A schematic model for hedge fund return sources based on results in Table 1 (long-only fund: market risk plus alpha; hedge fund: hedged market risk, leverage, and alternative risk exposures such as equity, credit, yield curve, value stock, small firm, liquidity, short option, complexity, commodity, convergence, event and FX rate risk, plus manager skill alpha from security selection, timing and execution)
Interpreting our results leads us to a schematic illustration of where hedge fund returns come from (Fig. 6). A long-only manager (represented by the left bar) has two sources of returns: the market exposure and the manager excess return, his "alpha" (which is negative for most managers in this domain). The difference between long-only investing and hedge funds is largely that the hedge fund will hedge away all or part of the broad market exposure. In order to achieve this risk reduction, the hedge fund manager employs a variety of
techniques and instruments not typically used by the long-only fund manager, including short selling and the use of derivatives. This results in what appears as a "pure alpha" product with low expected returns and low expected risk. But in order to be attractive as a stand-alone investment, the hedge fund manager has to conform to the market standard for return. This leads him to scale the risk by using leverage, which provides the desired magnification of return and risk. In this magnified configuration, systematic elements of risk and return that were previously hidden in the "alpha" are suddenly large enough to be analysed separately. In other words, we now have the necessary magnifying glass to separate out the "beta in alpha's clothing." We estimate that up to 80% of the returns from hedge funds originate as the result of beta exposure (i.e. exposure to systematic risk factors), with the balance accounted for by manager skill based alpha (or not yet identified risk factors). In the following we discuss our results for the individual strategy sectors, the summary of which is presented in Table 2 in comparison with the investable and non-investable indices from Hedge Fund Research.

Table 2. Cumulated performance of the RFS and the HFRX and HFRI strategy indices, data from March 2003 to August 2005

Strategy                  RFS       HFRX      HFRI
Equity Hedge              27.8%     16.0%     32.8%
Market Neutral             6.2%     −3.9%     10.9%
Short Selling            −28.2%      N/A     −23.0%
Event Driven              29.8%     24.1%     40.0%
Distressed                20.1%     23.3%     44.8%
Merger Arbitrage          13.0%     10.9%     15.3%
Fixed Income               7.8%      N/A      16.3%
Convertible Arbitrage      7.6%     −5.3%      2.4%
Global Macro              16.7%     10.1%     24.6%
Managed Futures            9.2%      N/A       N/A

6.1 Long/Short Equity
Most Long/Short Equity managers have exposure to both the broad equity market and particularly to small cap stocks. Managers may find it easier to find opportunities in a rising market, and it may also be easier to short sell large cap and buy small cap stocks. Our risk factor model in Table 1 confirms these results. The most significant factors are related to broad equity and small cap equity markets. Fung and Hsieh obtain similar results in a specific study on the Long/Short Equity strategy.36 They choose as independent variables the S&P 500 index and the difference between the Wilshire 1750 index and the Wilshire 750 index as a proxy for the small cap risk factor. We obtained
36 See W. Fung, D. Hsieh, "The Risk in Long/Short Equity Hedge Funds" (2004).
very similar results (having chosen the Russell 2000 and Russell 1000 for the calculation of the small cap spread). However, a closer look reveals that the exposure of Long/Short Equity hedge funds has a strongly non-linear profile. This non-linear exposure is reflected in the fact that the most explanatory independent variable is a convertible bond index.37 Apparently, this profile models the Long/Short Equity strategy well: less participation on the upside, protection on the downside to a certain point, but with more pronounced losses in a severe downturn of the equity markets (when convertible bonds lose their bond floor). Substituting a convertible bond factor for the equity factor thus yields a better model than a simple equity factor alone.38 However, there is another equity related factor that comes into play: hedge funds tend to decrease their exposure in falling equity markets and increase it in rising markets, similar to a "Constant Proportion Portfolio Insurance" strategy often employed in capital protected structures. We simulate this behaviour by including such a CPPI factor based on the rolling 12-month performance of the S&P 500. Figure 7 presents the performance of the RFS next to the HFR non-investable (HFRI) and the investable versions of the HFR (HFRX) and S&P indices since inception of the HFRX (inception of the S&P index occurred later, and at its inception it was taken to the same level as the HFRX in the graph). The chart confirms what the numbers already indicated: we can very well replicate the performance of the average Long/Short Equity manager in the index with an RFS model
Fig. 7. Returns (monthly and cumulated) of the non-investable HFRI Equity Hedge Index, the investable HFRX Equity Hedge Index, and the (investable) S&P Long/Short Equity Index (all in light color) vs. the RFS cumulative return (in dark color) based on the factor returns (see text for details)
37 The convertible bond index primarily serves as a proxy for high tech and small cap stocks. If we include the S&P 500 and Russell 2000 indices, a lot (but not all) of the explanatory power of the convertible bond index goes away.
38 We would like to note here, however, that the substitution of the Convertible factor with a straight equity risk factor such as the S&P 500 yields R-squares which are only about 10% below the values reported here. The convertible bond index thus can be considered as a proxy for small cap (and possibly Telecom/Technology) exposure.
with similar performance and volatility. The RFS performs in line with the HFRI index despite some alpha displayed in Table 1. Figure 17 sheds some light on this apparent discrepancy: Table 1 displays the average alpha over the regression period which, as Fig. 17 indicates, declines quite rapidly over time, while Fig. 7 only covers the most recent performance since 2003. There is only little alpha shown by Long/Short Equity managers in this most recent period, as Fig. 17 indicates. Finally, the RFS outperforms both investable indices (HFRX and S&P) significantly.
Equity Market Neutral
Equity Market Neutral strategies aim at zero exposure to specific equity market factors. Correspondingly, the model in Table 1 shows only a small (however statistically significant) exposure to broad equity markets. However, the results indicate that the Equity Market Neutral style carries sensitivity to the Fama-French momentum factor UMD and the value factor (the spread of the MSCI value and growth indices). The R2 value of the regression for Equity Market Neutral comes out lowest of all strategy sectors next to Managed Futures. In other words, simple linear models fall short of explaining a significant part of the variation of returns for this hedge fund style. To mix the right combination of systematic risk exposures of Equity Market Neutral strategies, we must distinguish two distinctly different sub-styles of this strategy. The first (often system based) approach buys undervalued stocks and sells short overvalued stocks according to a value and momentum based analysis. The second, more short term oriented approach (also referred to as "Statistical Arbitrage") trades in pairs based on a statistical analysis of relative performance deviations of similar stocks. Both styles naturally have a different exposure to the factors examined here. Figure 8 confirms what the numbers in Table 1 indicate: the RFS underperforms the HFRI index by some margin, reflecting the positive alpha in Table 1. However, it outperforms the HFRX investable index significantly.
Fig. 8. Returns (monthly and cumulated) of the non-investable HFRI Equity Market Neutral Index and the investable HFRX Equity Market Neutral Index (in light color) vs. the RFS cumulative return (in dark color) based on the factor returns (see text for details)
Short Selling
The main exposure of the Short Selling strategy is, quite obviously, being short the equity market. Interestingly, the exposure to the broad equity markets can best be modeled with the same factor as for the Long/Short Equity managers, the convertible bond index. This indicates the same type of non-linear exposure as for the Long/Short Equity strategy, however with the signs inverted. The strategy displays positive sensitivity to value stocks, as measured by the spread between the MSCI value and growth indices. The alpha value for Short Selling strategies stands at around 4-5% p.a. This indicates that the short side does offer some profit opportunities, possibly explained in part by most investors being restricted from selling short. However, the alpha of this strategy must be high in order for the strategy to generate any profits at all. This is because, from the perspective of risk factor exposure, shorting the equity markets starts off with an expected negative 4-7% return (long term performance of the equity markets minus the short rebate on the short positions). As a result Short Selling is the only hedge fund strategy with negative past performance over the last 15 years. This is also reflected in Fig. 9 for the more recent period. We observe that the Short Selling strategy can be well replicated by the RFS model.
Event Driven
Event Driven hedge funds constitute an ensemble of various investment strategies around company specific events including restructuring, distress and mergers. According to our factor model in Table 1 the average Event Driven strategy comes with a rather simple exposure to the broad equity market, small cap stocks and the high yield bond market. Further, the AR(1) factor indicates autocorrelation in returns, reflecting liquidity risk and possibly lagged pricing of the underlying securities. Our model explains an astonishing 80% of
Fig. 9. Returns (monthly and cumulated) of the non-investable HFRI Dedicated Short Bias Index vs. the RFS cumulative return (in dark color) based on the factor returns (see text for details). Note: An investable version of the HFR index does not exist for dedicated short hedge funds
Fig. 10. Returns (monthly and cumulated) of the non-investable HFRI Event Driven Index and the investable HFRX Event Driven Index (in light color) vs. the RFS cumulative return (in dark color) based on the factor returns (see text for details)
the variation of Event Driven returns. Alpha is the highest for any strategy in the hedge fund universe with roughly 5% p.a. over the analyzed period. This is also reflected in Fig. 10, where we see that the RFS model yields roughly two thirds of the return of the Event Driven managers in the HFRI index. However, again, the RFS outperforms the HFRX and S&P investable index versions significantly.
Distressed Securities
Distressed Securities strategies come with a simple set of exposures to credit, equity, particularly small cap equity, and liquidity risks. These are exactly the factors which show up in Table 1. The AR(1) factor bears the largest sensitivity, reflecting the low degree of liquidity offered in Distressed Securities investing. A lack of regular pricing and valuation induces autocorrelation in the return streams. These partly rather illiquid strategies closely resemble the return sources of private equity investment. The investor provides an important funding source for companies without access to traditional capital sources during important phases of their development, usually times of distress. In contrast to investors in regular stocks, an investor in distressed debt or equity, just like a private equity investor, has no direct access to his capital for several years. He is further exposed to uncertainty about the size and timing of future cash flows. Not surprisingly, the level of alpha for Distressed hedge fund managers is around 3-4% p.a., which is, along with its peers in other Event Driven sectors (e.g. Merger Arbitrage), among the highest in the hedge fund industry. This is also reflected in Fig. 11, where we see that the RFS model yields roughly half of the return of the Distressed managers in the HFRI index. Even the investable HFRX index outperforms the RFS.
Fig. 11. Returns (monthly and cumulated) of the non-investable HFRI Distressed Index and the investable HFRX Distressed Index (in light color) vs. the RFS cumulative return (in dark color) based on the factor returns (see text for details)
Merger Arbitrage
In their seminal paper on the Merger Arbitrage strategy, Mitchell and Pulvino39 examine the conditional correlation properties of this strategy: Merger Arbitrage strategies display rather high correlations to the equity markets when the latter decline and comparably low correlations when stocks trade up or sideways. This corresponds to a correlation profile similar to that of a sold put on equities. As a matter of fact, the payout profile of Merger Arbitrage strategies corresponds directly to a sold put option on announced merger deals. This short put profile is reflected in the significance of the BXM factor in Table 1. Shorting put options provides limited upside but full participation on the downside (less the option premium). This argument extends beyond the immediate exposure to merger deals breaking up: when the stock market falls sharply, merger deals are more likely to break. In addition, a sharp stock market decline will reduce the likelihood of revised (higher) bids and/or bidding competition for merger targets. Falling stock markets also tend to reduce the overall number of mergers, which increases the competition for investment opportunities and may thereby reduce the expected risk premium. The strategy therefore has a slightly positive, however strongly non-linear, stock market beta. This overall exposure profile to equity markets comes more from the correlation between the event risk and the market than from the individual positions. Mitchell and Pulvino calculated the historical track record of a simple rule-based merger arbitrage strategy that at any time invests in each announced merger deal, both cash and stock-swap, with a pre-specified entry and exit rule.40 They conducted this calculation for 4,750 merger transactions from 1963 to 1998. The hedge fund manager Bridgewater performed a very similar study but constrained themselves to the ten largest mergers at any point in time. In both cases the resulting simulated returns came very
39 See M. Mitchell and T. Pulvino, "Characteristics of Risk in Risk Arbitrage" (2001).
40 See M. Mitchell and T. Pulvino, "Characteristics of Risk in Risk Arbitrage" (2001).
Fig. 12. Returns (monthly and cumulated) of the non-investable HFRI Merger Arbitrage Index and the investable HFRX Merger Arbitrage Index (in light color) vs. the RFS cumulative return (in dark color) based on the factor returns (see text for details)
close to the returns of the Merger Arbitrage hedge fund indices (HFR and Tremont). We included a strategy which invests exactly along the lines of the Mitchell/Pulvino study, the publicly available "Merger Fund."41 Our regression shows what we expected: exposure to the equity markets, in particular the small cap segment (and furthermore the value sector), the BXM index and the Merger Fund. However, the explanatory strength of the model is not that high (considering that these factors should very well reflect what the strategy is about). Just as with other Event Driven strategies, the alpha value is above average for this strategy with around 4% p.a. However, a comparison with the performance of the RFS in Fig. 12 shows that the skill based component of returns has declined in recent years, as the RFS tracks the performance of the HFRI Merger Arbitrage index rather closely. Again, the RFS outperforms the investable version of the HFR index by a safe margin.
General Relative Value
Relative Value strategies – represented here by Fixed Income Arbitrage and Convertible Arbitrage – have three types of systematic exposure. They first capitalize on price spreads between two or more related financial instruments which often represent a compensation for particular risks such as credit risk, interest rate term structure risk, liquidity risk, or exchange rate risk. Secondly, they provide liquidity and price transparency in complex instruments, employing proprietary valuation models to value complex financial instruments. Related returns can be referred to as liquidity and "complexity" premia. The latter is related to the risk of mis-modeling the complexity of the underlying financial instrument: the hedge fund manager is short an option which turns strongly into the money when his valuation model is inaccurate. Finally, Relative Value hedge fund managers have a preference for negatively
41 Bloomberg ticker: MERFX US Equity.
skewed return distributions, where steady but small gains are countered by rare but large losses. In other words, the managers are short some sort of volatility, which makes the return profile resemble the payout profile of a short option position.
Fixed Income Arbitrage
Fixed Income Arbitrage strategies often expose themselves to a combination of liquidity, credit and term structure risks, e.g. through credit barbell strategies (long short-term debt of lower credit quality and short long-term government bonds), yield curve spread trades, or on-the-run vs. off-the-run treasury bond positions. Exposures to credit risk, convertible bonds and emerging market bond securities are most prevalent, as Table 1 indicates. The significance of the AR(1) term indicates autocorrelation in returns, signaling lagged pricing of the underlying securities, and reflects liquidity risk. According to our factor model the alpha value for Fixed Income strategies is in the region of 2.5% p.a., and the model explains around 41% of the variation of returns. Fung and Hsieh42 chose another – but similar – set of factors including options on interest spreads (they call these "ABS factors") to model various Fixed Income Arbitrage trading styles. They obtain slightly higher R2 values than presented in our study here. Their and our results explain why the heaviest losses of this style occurred in "flight to quality" scenarios, when credit spreads suddenly widen, liquidity evaporates and emerging markets fall sharply. Events like the summer of 1998 remind us that the strategy bears a risk profile similar to a short option, with the risk of significant losses but otherwise steady returns. It is inherently difficult to model the exposure to these extreme events, as they are so rare that their true likelihood is hard to calculate. However, the hedge fund investor should nevertheless keep this exposure in mind. Figure 13 shows that the RFS returns cannot quite keep up with the HFRI returns, consistent with our results in Table 1.
Fig. 13. Returns (monthly and cumulated) of the non-investable HFRI Fixed Income Index vs. the RFS cumulative return (in dark color) based on the factor returns (see text for details). Note: An investable version of the HFR index does not exist for Fixed Income Arbitrage hedge funds
42 See W. Fung, D. Hsieh, "The Risk in Fixed Income Hedge Fund Styles" (2002).
Convertible Arbitrage
Convertible Arbitrage hedge funds are exposed to a variety of different risk factors: credit risk, equity market and equity volatility risk, and liquidity risk. These factors – the high yield factor, the convertible and equity factors, and the AR(1) factor – also appear as the relevant factors in Table 1. As for Fixed Income Arbitrage, the Convertible Arbitrage model shows a significant AR(1) term which indicates autocorrelation in returns also for this strategy. This signals a lack of consistent and timely pricing of the underlying convertible securities and reflects exposure to liquidity risk and valuation risk. To mix the right combination of these risks, however, we must distinguish two distinctly different sub-styles of Convertible Arbitrage strategies. The option-based Convertible Arbitrage style simply buys the convertible bond, sells short the underlying equity and re-establishes a delta hedge frequently, a trading technique referred to as gamma-trading. This style tries to hedge out credit risk as much as possible and thus cares little about the credit markets. The second, credit-oriented, style makes an explicit assessment of the issuer's creditworthiness and takes on credit risk it considers overpriced. Both styles naturally have a different exposure to the credit markets. Naturally, the credit-oriented sub-style of Convertible Arbitrage carries a significant exposure to credit risk, while the option-based sub-style does not. As credit risk is correlated with equity markets, the second style has a less well-defined sensitivity to falling equities. Increasing volatility helps the strategy, but widening credit spreads hurt it. The option-based gamma trading style, in contrast, performs better in a volatile environment in which equities are falling, which explains the overall negative correlation of Convertible Arbitrage hedge funds to the equity markets in Table 1. Declining volatility leads this strategy to under-perform during the period of decline. The dual nature of Convertible Arbitrage hedge funds led to an interesting development in 2003 which confused some investors. In an environment of simultaneously rapidly declining credit spreads and equity volatility, credit-oriented Convertible Arbitrage strategies displayed stellar performance while the gamma traders displayed disappointing returns that hovered near zero. This divergence in style is currently not reflected in the available hedge fund indices, which makes it more difficult for factor models to capture the sensitivities of the style. To correctly evaluate these two variants of Convertible Arbitrage, we would need a separate index for each sub-style. In a recent research paper,43 V. Agarwal et al. separate the key risk factors in Convertible Arbitrage strategies: equity (and volatility) risk, credit risk, and interest rate risk. Consequently they design three "primitive trading strategies" to explain the returns of the strategy in terms of the key risk factors and premia captured by these strategies: positive carry, credit risk premium ("credit arbitrage") and
43 V. Agarwal, W. Fung, Y. Loon, N. Naik, "Risks in Hedge Fund Strategies: Case of Convertible Arbitrage" (2004).
Fig. 14. Returns (monthly and cumulated) of the non-investable HFRI Convertible Arbitrage Index and the investable HFRX Convertible Arbitrage Index (in light color) vs. the RFS cumulative return (in dark color) based on the factor returns (see text for details)
gamma trading ("volatility arbitrage"). They investigate these factors in the US and Japanese convertible markets. These factors can explain up to 54% of the return variation of Convertible Arbitrage indices. According to our factor model the alpha value for Convertible Arbitrage strategies is in the region of 2% p.a., and the model explains around 65% of the variation of returns. However, we observe for the more recent period that the RFS model outperforms the HFRI Convertible Arbitrage strategy slightly, with significantly less volatility, as shown in Fig. 14. The outperformance becomes even more striking when considering the investable HFRX index.
Global Macro
Global Macro managers of all types do better in strong bond markets, as indicated by the strong sensitivity to the bond market index shown in Table 1. Other exposures are less obvious: exposure to the risk characteristic of trend following strategies (the sGFI factor) and some non-linear exposure to the broad equity market (the convertible bond factor). The R2 value for the regression of Global Macro comes out relatively low (50%). We assume this is due to the heterogeneity of the strategy. Global Macro trading includes a wide range of different trading approaches, and a broad index does not reflect this diversity. A manager-based analysis would be more appropriate here. More than a broad asset class based index or a generic trading strategy, it is the particular markets traded by the individual manager and his particular investment techniques that define the available risk premia and inefficiencies targeted. However, note that our model gives an alpha value of around 3% p.a. for the average Global Macro manager. This is correspondingly reflected in Fig. 15, showing an underperformance of the RFS of around 3-4% p.a. But again, the investable version underperforms the RFS.
Fig. 15. Returns (monthly and cumulated) of the non-investable HFRI Global Macro Index and the investable HFRX Global Macro Index (in light color) vs. the RFS cumulative return (in dark color) based on the factor returns (see text for details)
Managed Futures
Managed Futures hedge funds are the main speculative agents in the global futures markets, thus capturing what we referred to as the "commodity hedging demand premium." A simple trend following trading rule (sGFII) applied to the major global futures markets captures a large part of these returns and shows up as the most dominant term in the regression in Table 1. Several different studies have independently obtained this result.44 The sGFII index is designed to model the return of trend following strategies with a simple rule based momentum approach. It is a volatility weighted combination of trend following strategies on 25 liquid futures contracts on commodities, bonds, and currencies. This index shows a 48% correlation with the CISDM trend following index, and equally a 48% correlation with the CSFB/Tremont index. Based on the regression in Table 1 the average CTA in the CISDM Trendfollower index displays negative alpha. Schneeweis/Spurgin and Jensen and Rotenberg (Bridgewater) use similar trend following indicators on a much more restricted set of contracts.45 They obtain an even higher correlation coefficient to the CSFB-Tremont Managed Futures index (71% in the case of Bridgewater) or the CISDM Managed Futures indices (79% against the CISDM trend following index for Schneeweis/Spurgin). The lower correlation of the sGFII index is possibly due to a comparably high exposure to commodity contracts compared to Bridgewater's and Schneeweis/Spurgin's models (which overweight financial futures contracts).
44 See L. Jaeger et al., "Case study: The sGFI Futures Index" (Summer 2002); G. Jensen, J. Rotenberg, "Hedge Funds Selling Beta as Alpha" (2003); R. Spurgin, "A Benchmark on Commodity Trading Advisor Performance" (1999).
45 T. Schneeweis and R. Spurgin, "Multifactor Analysis of Hedge Funds, Managed Futures, and Mutual Fund Returns and Risk Characteristics" (1998); G. Jensen and J. Rotenberg, "Hedge Funds Selling Beta as Alpha" (2003).
Fig. 16. Returns (monthly and cumulated) of the non-investable (!) CISDM Managed Futures Qualified Universe Index (in grey color) vs. the RFS cumulative return (in dark color) based on the factor returns (see text for details)
An interesting model for trend-following strategies was proposed by Fung and Hsieh. They constructed their trend-following factor using lookback straddle payout profiles on 26 liquid global futures contracts and the corresponding options (across equities, bonds, currencies and commodities). A lookback straddle pays the difference between the highest and lowest price of the reference asset over the period until maturity of the option, mimicking the payout of a trend-follower with perfect foresight. The degree of explanatory power of their model is around R2 = 48%, higher than all three models described above. Note that the Managed Futures strategy is the only hedge fund sector which displays negative alpha (albeit not at a statistically significant level). We can observe the corresponding performance pattern of CTAs compared to the RFS in Fig. 16: the performance of the RFS and the average CTA in the CISDM Managed Futures Qualified Universe Index are very well in line, while the investable S&P Managed Futures index underperforms both by a significant margin.
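For illustration only, here are two toy sketches of the ingredients discussed in this section: a volatility-scaled momentum rule of the general kind the sGFII index is said to embody (the actual sGFII rules, lookback length and weights are not given here, so these parameters are assumptions), and the payout of a lookback straddle as used by Fung and Hsieh.

import numpy as np

def momentum_signal(prices, lookback=60):
    """+1 if the price is above its level `lookback` days ago, else -1."""
    return np.sign(prices[lookback:] - prices[:-lookback])

def vol_weighted_trend_return(prices, lookback=60, target_vol=0.01):
    """Daily returns of a momentum rule, scaled to a common volatility per market."""
    rets = np.diff(prices) / prices[:-1]
    sig = momentum_signal(prices, lookback)[:-1]    # trade on the previously observed signal
    realized_vol = np.std(rets, ddof=1)
    return (target_vol / realized_vol) * sig * rets[lookback:]

def lookback_straddle_payout(prices):
    """Pays the high-low range over the option's life (perfect-foresight trend profit)."""
    return prices.max() - prices.min()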
7 The Future of Alpha
There is good reason to believe that the average alpha extracted by hedge fund managers is generally destined to decline. As a matter of fact, we can already observe today that alpha has grown smaller over time, as Fig. 17 indicates for the most obvious strategy, Long/Short Equity, where we display the alpha of a rolling regression over a 60-month time window. Independently of our research, the attenuation of alpha has been observed elsewhere; Fung et al. report on the same phenomenon in one of their later research papers.46
46 W. Fung, D. Hsieh, N. Naik, T. Ramadorai, "Hedge Funds: Performance, Risk and Capital Formation", Preprint (2005).
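A short sketch, under the same OLS assumptions as the regression example above, of the rolling 60-month alpha estimation behind Fig. 17: the factor regression is re-run on each moving window and only the intercept is kept. The function name is ours.

import numpy as np

def rolling_alpha(fund_excess, factor_matrix, window=60):
    """Monthly alphas from a rolling OLS of excess returns on the risk factors."""
    T = len(fund_excess)
    alphas = []
    for end in range(window, T + 1):
        y = fund_excess[end - window:end]
        X = np.column_stack([np.ones(window), factor_matrix[end - window:end]])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        alphas.append(coef[0])                 # the intercept is the (monthly) alpha
    return np.array(alphas)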
Fig. 17. The development of alpha for Long/Short Equity funds (HFR sub-index) based on a rolling regression over a 60-month time window (average alpha over the period: 0.56%). The risk factors were chosen as in Table 1
One possible explanation for this phenomenon comes quickly to mind: as more money chases a limited set of market inefficiencies, those inefficiencies should decrease or even disappear. In other words, the capacity for alpha is limited. However, there is no good reason to believe that the global "capacity for alpha", which is ultimately a function of how many inefficiencies the average global investor (and the corresponding regulatory agencies) will tolerate, has actually decreased that dramatically over time. While hedge funds have grown strongly and possibly have to compete harder with other "alpha chasers", they remain a rather small portion of global investment activity. Another, parallel explanation for the displayed decrease in alpha is the quality of the average hedge fund manager. The number of managers has multiplied in recent years, and it is reasonable to assume that today's low entry barriers to starting a hedge fund attract numerous managers with a lower level of skill. These tend to dilute the average performance and thus the average alpha of the entire hedge fund industry. An interesting research topic which we leave for future efforts is to test for the average alpha in the top percentile of managers. Will the "alpha" in hedge funds disappear entirely? Probably not, but it will become harder to identify and isolate in the growing jungle of hedge funds. However, we have seen that alpha constitutes a statistically significant variable (though decreasing over time) in most of our regression models. We might be missing explanatory variables in our models, and future modeling efforts will hopefully lead us to better models to answer this question. Another approach is to model the behavior of the alpha output of our models in changing market conditions as well as over time. Alpha might depend on market related variables other than prices which are not so easily captured in
our risk based models, such as trading volume, open short interest on stocks, insider activity, leverage financing policies of prime brokers, etc. A direct dependency of the hedge fund managers' alpha creation on these variables would lead us to a better understanding of the time variability that we empirically observe in our models. This will ultimately lead us to an understanding of the very alpha creation process of hedge funds, the part of hedge fund returns which still remains in the dark for most investors. However, little effort has been put into this task so far. The main task of the investor will be to define what he wants from hedge funds. Alpha is and will continue to be ultimately the most attractive sort of return, as it comes with no systematic risk and no correlation to other asset classes. But investors should realize both the scarcity of true alpha and the power of alternative beta. It is the power of diversification into orthogonal risk factors which will ensure that hedge funds remain broadly attractive for investors. And when it comes to the hedge funds' beta, there is surely a great deal more capacity available to investors than in the case of alpha. In fact, the future growth prospects of the hedge fund industry become quite compelling considering that we are far from any limit with respect to "beta capacity" in the hedge fund industry. While the search for alpha surely remains compelling, we believe it is investment in alternative betas which will more and more be the key to successful hedge fund investing in the future.
8 The Future of Hedge Fund Capacity
Now that we are in a position to provide a rough breakdown of hedge fund return sources, we can approach a question which lies at the heart of future hedge fund growth: the issue of capacity. For this purpose we perform a set of rather simple calculations:47 We know that the global market capitalization of all public stocks and debt is around 88'000 billion USD (about 51'300 billion USD in bonds, 36'700 billion USD in equity).48 Generating alpha in the global capital markets is an overall zero sum game, i.e. if hedge fund managers win this game, i.e. generate positive alpha, there must be other market participants on the losing end. We must thus assume an average tolerance level for inefficiencies, i.e. negative alpha, by equity and bond investors worldwide before competitive (or regulatory) forces step in to keep this number from getting larger. We estimate this number to be in the range of 0.25% p.a. on average across all equity and bond investors.49 With this number we can calculate
47 Note that this calculation is very similar in spirit to and takes some of its concepts from the work of H. Till, "The capacity implications of the search of alpha" (2004).
48 Source: www.fibv.com/publications/Focus0605.pdf and http://www.imf.org/external/pubs/ft/GFSR/2005/01/index.htm.
49 H. Till uses another number but aggregates the overall size of the market only over the holdings of HNWI, mutual funds and institutional funds. Considering our base number of 88'000 billion USD the assumptions are rather similar.
the overall alpha in the global equity and bond market to be USD 220 billion. We must further assume that hedge funds can participate in this "alpha pie" only to a certain extent, next to other professional players which are likely to be "positive alpha players" and thus compete with hedge funds for alpha (proprietary trading operations, large institutions, mutual funds – before their fees, etc.). It seems realistic to assume that hedge funds can take one fourth of that pie50 (a proportion which might grow larger over time, however, as more players from the other "alpha parties" move into the hedge fund space). This implies that there is USD 55 billion of pure alpha available to hedge funds each year. Further, we assume that hedge fund investors require at least a 15% p.a. return gross of fees (before management, performance, trading fees, etc.), which amounts to a net return of around 8%-10% and constitutes probably the minimum investors would require from hedge funds. This implies an overall capacity of hedge funds based on alpha only of USD 55 billion/0.15 = 366.6 billion USD, about one third of the actual size of assets in the hedge fund industry. Even with different, more beneficial assumptions on the overall investor tolerance for inefficiencies and on how much hedge funds can participate in the total "alpha pie",51 we would not come up with a capacity significantly higher than the current size of the industry. As a result, based on the assumption of inefficiencies in the global capital markets alone, we are not just lacking a satisfying economic explanation of hedge fund return sources, we also find ourselves in a position of not being able to explain the current size of the industry! But by now we understand that a large portion of hedge fund returns is not related to pure alpha, but rather to "alternative beta." The analysis in our research suggests that a large part of the average hedge fund return stems from alternative beta rather than alpha; we consider our estimate for that part to be as high as 80%. This raises the bar for hedge fund capacity significantly. Going along with our conclusion and estimating that only 20% of the industry's returns are related to pure alpha, we can calculate the capacity of the industry to be 366.6 billion USD/0.2 = 1'833 billion USD, about twice its current size. However, as large as this number seems, it is exceeded by some of the estimates given by industry protagonists as to what level the industry will grow to within the following years. How can this growth be managed considering our numbers? The answer is obvious: only by including a larger share of alternative beta in the overall return scheme of hedge funds. Assuming that the ratio of alpha vs. alternative beta becomes 10%, the capacity reaches 3'670 billion USD (assuming that the capacity for alternative beta is not limited at these levels, a fair assumption in our view).
50 The reader is invited to perform the calculation with different numbers.
51 The reader may use his own set of assumptions.
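The back-of-the-envelope arithmetic above can be restated in a few lines so that the reader can vary the assumptions, as invited in the footnotes. The sketch below is only a restatement of the article's stated inputs; none of the figures or names are new information.

```python
# A back-of-the-envelope sketch of the capacity arithmetic above; all inputs
# are the article's stated assumptions, and the reader is invited to vary them.
GLOBAL_MARKET = 88_000          # billion USD of public stocks and debt
INEFFICIENCY = 0.0025           # tolerated negative alpha, p.a., as share of the market
HF_SHARE_OF_ALPHA = 0.25        # share of the "alpha pie" available to hedge funds
GROSS_RETURN_TARGET = 0.15      # required gross return p.a. on hedge fund assets

alpha_pie = GLOBAL_MARKET * INEFFICIENCY              # ~220 bn USD per year
hf_alpha = alpha_pie * HF_SHARE_OF_ALPHA              # ~55 bn USD per year

def capacity(alpha_share_of_returns):
    # assets whose required gross return can be financed by the available alpha
    return hf_alpha / (GROSS_RETURN_TARGET * alpha_share_of_returns)

print(capacity(1.0))    # pure-alpha industry: ~367 bn USD
print(capacity(0.2))    # 20% alpha / 80% alternative beta: ~1,833 bn USD
print(capacity(0.1))    # 10% alpha: ~3,667 bn USD
```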
Summarizing, there is indeed plenty of room for the hedge fund industry to grow, albeit only at the expense of becoming more and more beta-driven. This development will inevitably accompany the future growth of hedge funds. As a matter of fact, recent performance suggests that this process has already started.
9 Summary and Conclusion
The key to the hedge fund 'black box' is the understanding that hedge funds generate returns primarily through risk premia and only secondarily by exploiting inefficiencies in imperfect markets. Conceptually, hedge funds are therefore nothing really new: just as an equity mutual fund extracts the equity risk premium, a hedge fund may try to extract various other risk premia awarded for, say, credit risk, interest rate risk or liquidity risk. The important difference, however, is that the underlying risk premia are more diverse than those in traditional asset classes (which led us to refer to these premia as "alternative betas"). This insight is slowly spreading among the most sophisticated circles in the hedge fund industry. The underlying systematic risks can be readily analyzed and understood by investors, while the remaining part of returns stemming from inefficiencies is more difficult to describe in an unambiguous way. The risk premia available to hedge fund managers are the same as those available to other investors. However, extracting those premia in markets unfamiliar to most investors requires special expertise. Like the mining engineer who can profitably extract gold from low-grade ore that would previously have been left in the ground, skilled fund managers are simply more efficient in identifying existing risk premia and in trading with minimal undesired risk exposure and transaction costs to extract them. One of the pitfalls of hedge funds is that alpha and beta currently do not come separately but in an uncontrolled and perhaps undesired combination. Traditional portfolio management has developed a setting which could be equally applicable for hedge fund investors: the "core-satellite" framework. Here, alpha generation and beta extraction are well separated and very differently compensated. We believe hedge fund investors will want to walk down the same road. Hedge fund product providers might have to find a way to isolate and extract the alpha from the beta in hedge funds. This is the idea of "portable alpha": isolate alpha in one asset class and transfer it into a portfolio consisting of other types of assets. If a fund manager claims to produce alpha, why not take out the beta part of his returns with an active hedging overlay approach and keep only the alpha? A recent paper by W. Fung and D. Hsieh52 provides some interesting insights into a possible implementation of that idea and also gives some useful estimates about the size and distributional properties of the "alpha returns" for Long/Short Equity strategies.
52 W. Fung, D. Hsieh, "Extracting Portable Alpha from Equity Long/Short Hedge Funds," Journal of Investment Management (2004).
Currently available indices or benchmarks which rely on manager and peer group averages do not necessarily provide a sufficiently accurate picture of industry or strategy sector performance, due to various well known biases. The situation does not become much better when the indices are designed to be investable. At the same time, the demand for and necessity of hedge fund indices for the purpose of measuring manager performance, classifying investment styles and generally creating a higher degree of transparency are high and increasing. Some index providers actually claim that funds of funds have started to invest in investable indices to gain the desired exposure. While the authors are not aware of such behavior, they surely cannot exclude that some of the less sophisticated funds of funds have bought the marketing story of the index providers. But if we acknowledge that the currently existing investable indices are not a valid choice, what can we do? One way suggested in this article is to create synthetic benchmarks based on the factor exposure of hedge fund strategies to the underlying risk factors. This could potentially be a much better choice for funds of funds and other investors to gain the desired broad exposure to hedge fund styles. At the same time, these replicating factor strategies (RFS) can serve funds of funds as a benchmarking tool to judge the performance, or more precisely the alpha generation, of their managers. First results described here and elsewhere look promising for some strategy sectors. However, a great deal of work remains to be done for other strategies. We observe that a corresponding replication of hedge fund indices by "replicating factor strategies" (RFS) lives up to the returns of the (non-investable) hedge fund strategy sector indices for some strategy sectors, in particular Long/Short Equity, Merger Arbitrage, Managed Futures and Convertible Arbitrage. These strategies make up significantly more than 50% of the assets allocated to hedge funds! But as we emphasized in this article, these non-investable indices are not actually a good measure of the hedge fund return an investor would obtain on average; they overestimate it significantly. In contrast to the non-investable hedge fund indices, the RFS can be made investable without impacting their returns. When we compare the returns of the RFS with the corresponding versions of the investable indices, their outperformance becomes even more striking: the RFS actually outperform the entire range of investable indices by a safe margin, with the one exception of the Distressed sector. One must wonder why this is so. The flippant but accurate answer is: fees. Taking out an average of 2% management fees and a 20% share of performance fees at the level of the single hedge fund manager eats up all, and often more than all, of the skill-based returns hedge fund managers offer on average. We emphasize that the last two words are important: "on average." With the influx of new, often mediocre managers, average alpha has been coming down. However, we acknowledge that there continue to exist highly skilled hedge fund managers who generate persistent alpha even after their (hefty) fees. It remains the task of the experienced hedge fund investor or fund of funds to find and invest in them.
At the end of this report we would like to point out a further direction of research not sufficiently covered here. Our analysis models the factor loadings of hedge fund strategies as stationary. However, there is good reason to believe (and recent research provides some evidence53) that sudden and structural breaks occur in the systematic risk exposures of hedge funds which cannot be modelled well enough in a linear model context. Examples are easy to find: the blow-up of LTCM in the summer of 1998, the burst of the stock market bubble in the spring of 2000, the turn in the equity market in March 2003. A closer look at Fig. 4 reveals some evidence for such breaks, which our analysis here does not account for. In order to model hedge fund exposure during these breaks, which occur in extreme market environments, we need nonlinear exposure models. We leave this topic for future research. Generally, the recent progress in understanding the generic sources of hedge fund returns leads us to the conclusion that investable benchmarks constructed by a joint venture of financial engineers and quant groups on the basis of risk factor analysis and replication have the potential to offer a valid, theoretically more sound, and cheaper alternative to the hedge fund index products offered today. It is evident that once these indices become more broadly recognized, the hedge fund industry will be turned upside down. This will have some further important consequences for how hedge funds are categorized by investors. So far, most consider them a separate asset class. Realizing that hedge funds, with regard to their exposure to systematic risk factors, are conceptually not that different from traditional types of investments, investors may find it easier to integrate them into their overall asset allocation.
53 W. Fung, D. Hsieh, N. Naik, T. Ramadorai, "Hedge Funds: Performance, Risk and Capital Formation," Preprint (2005).
References
[1] Agarwal, V., Naik, N., "Performance Evaluation of Hedge Funds with Option-Based and Buy-and-Hold Strategies," Working Paper (2001); published under the title "Risks and Portfolio Decisions Involving Hedge Funds," Review of Financial Studies, 17, p. 63 (2004)
[2] Agarwal, V., Fung, W., Loon, Y., Naik, N., "Risks in Hedge Fund Strategies: Case of Convertible Arbitrage," Working Paper, London Business School (2004)
[3] Amin, G., Kat, H., "Welcome to the Dark Side: Hedge Fund Attrition and Survivorship Bias over the Period 1994-2001," Journal of Alternative Investments (Summer 2003)
[4] Asness, C., Krail, R., Liew, J., "Do Hedge Funds Hedge?," Journal of Portfolio Management, 28, 1 (Fall 2001)
[5] Asness, C., "An Alternative Future, I & II," Journal of Portfolio Management (October 2004)
[6] Bailey, J., "Are Manager Universes Acceptable Performance Benchmarks?," The Journal of Portfolio Management (Spring 1992)
[7] Brown, S., Goetzmann, W., Ibbotson, R., Ross, S., "Survivorship Bias in Performance Studies," Review of Financial Studies, 5, 4 (1992)
[8] Brown, S., Goetzmann, W., Ibbotson, R., "Offshore Hedge Funds: Survival and Performance 1989-1995," Journal of Business, 92 (1999)
[9] Brown, S., Goetzmann, W., "Hedge Funds With Style," Journal of Portfolio Management, 29, 101-112 (2003)
[10] Capocci, D., Hübner, G., "Analysis of Hedge Fund Performance," Journal of Empirical Finance, 11 (2004)
[11] Crowder, G., "Hedge Fund Indices," Journal of Alternative Investments (Summer 2001)
[12] Fama, E., French, K., "Common Risk Factors in the Returns of Stocks and Bonds," Journal of Financial Economics, 33 (1993)
[13] Fama, E., French, K., "Multifactor Explanations of Asset Pricing Anomalies," Journal of Finance, 51, 55 (1996)
[14] Fung, W., Hsieh, D., Naik, N., Ramadorai, T., "Hedge Funds: Performance, Risk and Capital Formation" (July 19, 2006), AFA 2007 Chicago Meetings Paper, available at SSRN: http://ssrn.com/abstract=778124
[15] Fung, W., Hsieh, D., "Empirical Characteristics of Dynamic Trading Strategies: The Case of Hedge Funds," The Review of Financial Studies, 10, 2 (1997)
[16] Fung, W., Hsieh, D., "The Risk in Hedge Fund Strategies: Theory and Evidence from Trend-Followers," The Review of Financial Studies, 14, 2, p. 313 (Summer 2001)
[17] Fung, W., Hsieh, D., "Benchmarks of Hedge Fund Performance: Information Content and Measurement Biases," Financial Analysts Journal (2001)
[18] Fung, W., Hsieh, D., "The Risk in Fixed Income Hedge Fund Styles," Journal of Fixed Income, 12, 2 (2002)
[19] Fung, W., Hsieh, D., "The Risk in Hedge Fund Strategies: Alternative Alphas and Alternative Betas," in L. Jaeger (ed.), "The New Generation of Risk Management for Hedge Funds and Private Equity Investment," Euromoney (2003)
[20] Fung, W., Hsieh, D., "Hedge Fund Benchmarks: A Risk Based Approach," Working Paper (2004)
[21] Fung, W., Hsieh, D., "The Risk in Long/Short Equity Hedge Funds," Working Paper, London Business School, Duke University (2004)
[22] Fung, W., Hsieh, D., "Extracting Portable Alpha from Equity Long/Short Hedge Funds," Journal of Investment Management, 2, 4, 1-19 (2004)
[23] Getmansky, M., Lo, A. W., Makarov, I., "An Econometric Model of Serial Correlation and Illiquidity in Hedge Fund Returns," Journal of Financial Economics, 74 (3), 529-610 (2004)
[24] Jaeger, L., "Through the Alpha Smoke Screens: A Guide to Hedge Fund Return Sources," Euromoney Institutional Investors (2005)
[25] Jaeger, L., "Hedge Fund Indices – A New Way to Invest in Absolute Return Strategies?," AIMA Newsletter (June 2004)
[26] Jaeger, L. (ed.), "The New Generation of Risk Management for Hedge Funds and Private Equity Investment," Euromoney (2003)
[27] Jaeger, L., "Managing Risk in Alternative Investment Strategies," Financial Times/Prentice Hall (May 2002)
[28] Jaeger, L., "Sources of Return for Hedge Funds and Managed Futures," The Capital Guide to Hedge Funds 2003, ISI Publications (November 2002)
[29] Jaeger, L., Jacquemai, M., Cittadini, P., "Case Study: The sGFI Futures Index," The Journal of Alternative Investments (Summer 2002)
[30] Jensen, J., Rotenberg, J., "Hedge Funds Selling Beta as Alpha," Bridgewater (2003, updated 2004 and 2005)
[31] Kohler, A., "Hedge Fund Indexing: A Square Peg in a Round Hole," State Street Global Advisors (2003)
[32] Malkiel, B., Saha, A., "Hedge Funds: Risk and Return," Working Paper (2004)
[33] Mitchell, M., Pulvino, T., "Characteristics of Risk in Risk Arbitrage," Journal of Finance, 56, 6, 2135 (2001)
[34] Schneeweis, T., "A Review of Alternative Hedge Fund Indices," Schneeweis Partners (2001)
[35] Schoenfeld, S. A., "Active Index Investing," Wiley, New York (2004)
[36] Spurgin, R., "A Benchmark on Commodity Trading Advisor Performance," Journal of Alternative Investments (Fall 1999)
[37] Sharpe, W., "Asset Allocation: Management Style and Performance Measurement," Journal of Portfolio Management, 2, 18 (Winter 1991)
[38] Till, H., "The Capacity Implications of the Search of Alpha," AIMA Newsletter (2004)
Asset Securitisation as a Profits Management Instrument
Markus Schmidtchen
KfW Bankengruppe, Frankfurt, Germany, [email protected]†
1 Introduction
The market for loan products and credit derivatives has enjoyed strong growth in liquidity for some years now. A growing number of players populate both the supply and demand sides of the market, leading to product diversification on the one hand and a broadening of demand on the other. Rapid growth has been observed, in particular, in the single-name credit default swap market, through which institutions can hedge against credit default by large, well-known enterprises. The instrument that can be used to hedge against credit default by small and medium-sized enterprises (SMEs), however, is portfolio securitisation, through which loans or loan default risks are transferred as a package. One of the reasons for making a bundled transfer is the limited exposure to individual SMEs. In parallel with the development in the capital markets, many credit institutions are re-organising their credit risk management units. This is being driven, on the one hand, by the regulatory demands on credit risk measurement under Basel II. On the other hand, greater capital market liquidity is making loan products more mobile. Banks are conducting increasingly active credit portfolio management. For example, the capital market may be used to deliberately increase credit exposure by means of an investment or to deliberately reduce risk by loan securitisation. This enables a bank to optimise its credit portfolio, which in turn has a positive impact on its profits position. Even if there are usually a number of different reasons for portfolio securitisation, it is clear that economically a transaction is appropriate when the economic capital released by the risk reduction and reinvested in new lending business generates enough profits to cover the cost of the securitisation.
† This article presents the author's opinion only and not an official statement of KfW's views.
This article compares the implications of different securitisation strategies for both the bank's overall portfolio risk and its return on capital. It shows that, under specific assumptions, a securitisation strategy in which both the first loss position and the senior tranche are retained by the placing institution turns out to be optimal. This optimal securitisation strategy is discussed below, taking the example of an SME bank. First, the risk situation of the bank before securitisation is presented. Then both the risk effects and the profits effects of complete risk placement and of optimal risk placement are measured and compared. The calculations on which the results are based were derived by applying a Monte Carlo model that is well established in the capital market.1
2 Situation Before Securitisation
The analysis takes as its starting point a credit institution whose loan portfolio loss distribution before securitisation is shown in Fig. 1. It is assumed that this bank has a loan portfolio of 2.1 billion Euros. This portfolio has an average credit rating of "Ba1", to which a default probability of 0.8% per annum corresponds (typical for SMEs), and is 50% secured.
[Fig. 1. The bank's loss distribution before securitisation (one-year horizon). The chart plots the probability of each loss severity and marks the expected loss and the unexpected loss (capital commitment). Pool characteristics: volume 2.1 bn Euros, average rating Ba1, average recovery rate 50%, expected loss (1 year) 0.4%. Capital utilisation before securitisation: volume 2,110,000,000; expected loss 0.4% (9,114,585); 99.97% quantile 3.9% (82,290,000); capital commitment 3.5% (73,175,415).]
1 The Monte Carlo model is based on a Gaussian copula function; it simulates the loss distribution of the credit institution across scenarios and thus depicts the bank's risk situation. To derive this loss distribution, it is assumed that the bank expects an average asset correlation within the loan portfolio of around 8%. This value is at the lower limit of the correlation assumptions used by Basel II to derive the risk weightings for small and medium-sized enterprises.
These portfolio characteristics produce an expected loss of 0.4% per annum. The unexpected loss, or the economic risk corresponding to the loan portfolio, is derived, however, from the targeted solvency level and the composition of the bank's portfolio. The targeted solvency level is first set at 99.97%, which corresponds to an "Aa" probability of default. Figure 1 shows the loss distribution of the loan portfolio before securitisation. This distribution attributes a probability of occurrence (ordinate) to each potential loss amount (abscissa). The mean of this distribution, which determines the value of the expected loss, is around 9.1 million Euros, i.e. 0.4% of the pool volume. The 99.97% quantile of the distribution is around 82 million Euros, or 3.9% of the loan volume. This means that the bank needs to back the loan portfolio with a total of 3.9% capital in order to achieve the targeted credit standing. As is customary in banking practice, it is assumed that this capital backing comprises 0.4 percentage points of standard risk costs and 3.5 percentage points of economic capital. It is assumed that the standard risk costs are fully included in the credit margins. This ensures that the expected loss on the loan portfolio, which corresponds to the standard risk costs, is borne by future margin income. By contrast, the economic capital must be held available by the credit institution. It is used to ensure that the institution remains solvent if unexpectedly high losses are incurred. For the further analysis it is assumed that the credit institution requires an 11% per annum return on economic capital and is invariably in a position to enforce the required credit margins.
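To make the mechanics concrete, the following sketch simulates such a loss distribution with a one-factor Gaussian copula and reads off the expected loss, the 99.97% quantile and the resulting economic capital. It is our illustration, not the author's actual Monte Carlo engine; the obligor count, the exposure per obligor and all parameters are assumptions chosen to roughly mirror the example's figures.

```python
import numpy as np
from scipy.stats import norm

# Minimal sketch, not the author's model: a one-factor Gaussian-copula loss
# simulation for a homogeneous portfolio (0.8% PD, 50% recovery, ~8% asset
# correlation, 2.1 bn Euro total exposure), all values assumed for illustration.
rng = np.random.default_rng(0)
n_obligors, exposure = 2_100, 1_000_000
pd_, recovery, rho = 0.008, 0.5, 0.08
n_scenarios = 500_000

threshold = norm.ppf(pd_)                                   # default threshold per obligor
z = rng.standard_normal(n_scenarios)                        # systematic factor
p_cond = norm.cdf((threshold - np.sqrt(rho) * z) / np.sqrt(1.0 - rho))
defaults = rng.binomial(n_obligors, p_cond)                 # defaults per scenario
losses = defaults * exposure * (1.0 - recovery)

expected_loss = losses.mean()
quantile_9997 = np.quantile(losses, 0.9997)
economic_capital = quantile_9997 - expected_loss            # unexpected loss
print(f"EL={expected_loss:,.0f}  99.97% quantile={quantile_9997:,.0f}  EC={economic_capital:,.0f}")
```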
3 Securitisation Pool
In order to measure the effects of a securitisation, it is presumed that the credit institution extracts a sub-portfolio worth 350 million Euros from its existing loan portfolio with the intention of securitising this sub-portfolio synthetically over a period of 5 years. The randomly selected portfolio also has an average credit rating of "Ba1" and 50% collateralisation. The loss distribution of this portfolio over the securitisation period and the ensuing tranching are shown in Fig. 2. The table in Fig. 2 shows that the pool has been subdivided into seven tranches. The first loss piece (FLP) accounts for 2.35% of the volume. The "Aaa" tranche accounts for 89% and is thus by far the largest securitisation tranche. In order to obtain as clear a distinction as possible between the placing of expected and of unexpected losses when analysing the securitisation effects, the tranche directly above the FLP has a very low "B3" rating. This ensures that the FLP consists mainly of expected losses. This can be seen, for example, from the ratio of the expected losses of the securitisation pool over 5 years (1.8%) to the size of the FLP (2.35%), which is around 76%.
[Fig. 2. Loss distribution of the securitisation pool (five-year horizon). Pool characteristics: volume 350 m Euros, average rating Ba1, average recovery rate 50%, expected loss (5 years) 1.80%.]

Capital structure (from Fig. 2):
Rating        FLP      B3      Ba2     Baa2    A2      Aa2     Aaa
Volume        2.35%    1.15%   1.50%   1.75%   2.00%   2.25%   89.00%
Spread p.a.   25.00%   8.00%   2.50%   0.75%   0.50%   0.32%   0.10%
In addition to the sizes of the individual tranches, the table also shows the assumptions regarding the spreads that the institution carrying out the securitisation has to pay to the capital market investors for assuming the risk. The spreads for the categories "Aaa" to "Ba2" are based on observed SME securitisations. The price for the FLP and the "B3" tranche has been selected in such a way that, considering the expected losses of these tranches, the investor achieves a return on the investment of around 12% for the FLP and around 5% for the "B3" tranche. Similar prices can currently also be observed in the capital market. It is also expected that the securitisation generates transaction costs totalling some 1.3 million Euros. These costs include payments to the arranger, the rating agency, lawyers, etc., some of which have to be paid upfront while others are running fees.
4 Effects of Full Portfolio Securitisation
The impact on the bank's portfolio loss distribution after placing the entire securitisation pool is shown in Table 1. For the purpose of a comparative static analysis, the risk situation before securitisation is also shown in Table 1. The table shows that after securitisation the bank's total risk exposure is reduced by the amount of the placed volume. In addition, the relative expected loss increases marginally and the 99.97% quantile is now 4% of the remaining volume. Overall, this is accompanied by a 0.08 percentage point increase in the relative economic capital commitment.
Table 1. The bank’s risk situation in case of full securitisation Capital utilisation (1-year view) Volume
Expected loss
99.97% quantile
Capital commitment
Before securitisation 2,110,000,000 0.43% 9,114,585 3.90% 82,290,000 3.47% 73,175,415 After securitisation 1,760,000,000 0.45% 7,900,933 4.00% 70,400,000 3.55% 62,499,067
The slight increase in the relative economic capital can be attributed to two facts. First, the quality of the loan portfolio retained by the bank has worsened slightly. This can be seen from the increase in the relative expected loss and is due to the fact that the randomly selected securitised loans have a credit rating that is slightly above average. Second, the placing reduces the granularity of the bank's loan portfolio, leading to an increase in the probability of extreme losses; more economic capital must be retained to cover this possibility. While the relative moments of the loss distribution increase slightly, both the absolute expected loss and the absolute capital commitment are obviously reduced. The impact of placing the entire risk on the bank's returns can be seen in the simplified income statement presented below (see Table 2). The income statement lists the total profits and expenditure of the securitising institution over the entire securitisation transaction period. This approach is needed to illustrate the profitability of a securitisation covering several periods in full.2 The expenses generated by this securitisation strategy comprise transaction costs and capital market costs, which cover payments to the investors. On the profits side, the institution can record the reduction in the expected loss. It was assumed that the expected loss, or the standard risk costs, is part of the credit margin. Owing to the placing of the expected loss, this flow of payments to the bank can now be considered entirely as profits. Further profits from securitisation are derived from freeing up economic capital. The amount of this item is calculated by paying interest on the released economic capital (roughly 10.7 million Euros) at the target return on economic capital (11%) over the entire transaction period. If the income and expenditure sides are added together, total profits are roughly −3.6 million Euros. Consequently, this securitisation strategy is not economical under the given assumptions.
2 In order to quantify the intertemporal effects of securitisation, some simplifying assumptions have been made. For example, it has been assumed that the risk reduction effects that are presented in Table 1 and that are calculated for the first year are also valid in the subsequent years. In addition, these future amounts are not discounted.
Table 2. Profits on full risk placement (income statement, 5-year view)a

Profits                                       Expenditure
Reduction in expected loss       6,068,262    Transaction costs              1,298,463
Profits from release of          5,871,991    Capital market costs:
  economic capital                              Junior (FLP)                 9,906,372
                                                Mezzanine (B3 to Aa2)        2,796,938
                                                Senior (Aaa)                 1,557,500
Total                           11,940,253    Total                         15,559,272
Total profits                   −3,619,019
a The individual amounts on the expenditure side are generated as follows. The transaction costs have been set by the author and are based on currently applicable values. The costs of the junior tranche arise on the assumption that the investor bears the full expected loss on the FLP, which is roughly 5.4 million Euros, and is fully compensated for this by the margin; in addition, he receives a 12% per annum return on his investment. The costs of the remainder of the capital structure are derived by multiplying the nominal volumes of the individual tranches by the respective spreads and the transaction period. Obviously, this is another simplification, because the effects of the expected losses within the tranches on the transaction costs are neglected. On the profits side, the reduction in the expected losses can be derived directly from Table 1: the reduction in Year 1 of some 1.2 million Euros is simply multiplied by the transaction period. Much the same applies to the profits from releasing economic capital; the reduction in Year 1 is multiplied by the transaction period, and this amount is then multiplied by the required return on economic capital of 11%.
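The arithmetic of footnote a can be retraced in a few lines. The sketch below is our reconstruction under the stated figures, not the author's code: mezzanine and senior costs are nominal volume times spread times transaction period, while the junior (FLP) cost is taken as given, since the author derives it from the FLP's expected loss plus the 12% p.a. investor return rather than from its quoted spread.

```python
# A hedged sketch of the Table 2 arithmetic; all inputs are the example's
# stated figures and the variable names are ours, not the author's.
YEARS, POOL = 5, 350_000_000
spreads = {"B3": (0.0115, 0.08), "Ba2": (0.0150, 0.025), "Baa2": (0.0175, 0.0075),
           "A2": (0.0200, 0.005), "Aa2": (0.0225, 0.0032), "Aaa": (0.8900, 0.001)}

transaction_costs = 1_298_463
junior_cost = 9_906_372                                      # FLP, see footnote a
other_costs = sum(POOL * share * spread * YEARS for share, spread in spreads.values())
expenditure = transaction_costs + junior_cost + other_costs  # ~15.56 m Euros

delta_el_per_year = 9_114_585 - 7_900_933                    # from Table 1
released_capital = 73_175_415 - 62_499_067                   # from Table 1
profits = delta_el_per_year * YEARS + released_capital * 0.11 * YEARS

print(f"total profits: {profits - expenditure:,.0f} Euros")  # roughly -3.6 million
```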
5 Effects of Optimal Portfolio Securitisation
Within the model, securitisation is shown to be optimal when only the mezzanine part of the portfolio is issued on the capital market; the FLP and the "Aaa" part are retained by the credit institution. The risk situation arising from this strategy is illustrated in Table 3. The table shows that securitisation reduces the bank's total risk exposure by the amount of the placed volume only. In addition, there is a marginal reduction in the expected loss. This can be attributed to the retention of the FLP, which accounts for by far the largest portion of the expected losses on the tranched portfolio. In contrast to the first moment of the loss distribution, there is a clear nominal and relative reduction in the 99.97% quantile, with the result that the capital commitment also decreases markedly and falls to around 64 million Euros. This value is only slightly above the nominal capital commitment when the securitisation pool is fully placed (see Table 1).3
3 It can be shown that this somewhat higher capital commitment can be attributed solely to the retention of the FLP. According to the model calculations, roughly 66% of the FLP, or 5.4 million Euros, are expected losses, which from the bank's perspective are not an economic risk. By contrast, the remainder of the FLP, amounting to roughly 1.6 million Euros, is a high-risk position, almost all of which must be deducted from the economic capital.
Table 3. The bank’s risk situation when the FLP and the “Aaa” tranche are retained Capital utilisation (1-year view) Volume Before securitisation After securitisation
Expected loss
99.97% quantile
Capital commitment
2,110,000,000 0.43% 9,114,585 3.90% 82,290,000 3.47% 73,175,415 2.079.725.000 0.44% 9,101,805 3.50% 72,790,375 3.06% 63,688,570
Table 4. Profits in case of the retention of the FLP and the "Aaa" tranche (income statement, 5-year view)

Profits                                       Expenditure
Reduction in expected loss          63,902    Transaction costs              1,298,463
Profits from release of          5,217,765    Capital market costs:
  economic capital                              Mezzanine (B3 to Aa2)        2,796,938
Total                            5,281,667    Total                          4,095,400
Total profits                    1,186,267
Consequently, most of the economic risk has been placed with the mezzanine tranches, and the retention of the senior tranche is not associated with any significant risk for the bank. This result is also intuitively plausible: the senior tranche has a higher credit rating than the solvency level targeted by the bank, and hence by the time the senior tranche defaults, chances are the bank will already have defaulted. The income statement effect, which derives from this securitisation strategy, is presented in Table 4. Compared with Table 2, the income statement expenditure in Table 4 is reduced by the capital market costs that are saved by not placing the junior and senior parts of the portfolio. In addition to expenditure, profits also decline. However, the profits do not decline as strongly as the expenditure. The profits are reduced by roughly 6 million Euros by retaining the FLP, i.e. through the expected losses not placed. By contrast, costs amounting to some 9.9 million Euros are saved by not placing the FLP. This imbalance can be explained, inter alia, by the fact that an FLP investor, in contrast to the securitising institution, is invariably entitled to interest for assuming the expected losses, in this case 12% per annum. This leads to the
conclusion that in the model used here it is not economically sound to place the FLP. The second central result is that placing the senior tranche is not economically sound either. Retaining this tranche saves around 1.6 million Euros in expenditure; by contrast, the profits from released economic capital fall, compared with full securitisation, by roughly 0.7 million Euros only. Overall, the securitisation strategy of retaining the junior and senior risks yields positive profits of around 1.2 million Euros. The economic success of the securitisation strategy presented here is essentially driven by the bank's targeted return on economic capital and its targeted solvency level. Whereas the targeted return on economic capital determines the opportunity costs of the use of capital, the solvency level affects the absolute amount of capital committed for a specific credit risk. If the solvency level is set at the 99.97% quantile, as in the calculations in the example, the total profits from the securitisation strategy rise as the required return on capital increases. This situation is shown by the solid line in Fig. 3. However, if the bank's targeted solvency level falls from 99.97% to 99.90%, which corresponds roughly to an "A" rating, the line in Fig. 3 shifts to the right. In this case, the break-even return on economic capital is roughly 11%, as opposed to roughly 8.5% in the previous case. This result can be attributed to the fact that the lower target credit rating leads to a lower capital commitment before securitisation and hence to less economic capital overall being released through the securitisation. Accordingly, the return on economic capital must be higher to ensure that securitisation makes sense in terms of profits.4
[Fig. 3. Connection between return on capital, credit rating and profits. The chart plots total earnings (in million Euros) against the required return on equity (7% to 17%) for the strategy of retaining the FLP and the senior tranche, once for a solvency level of 99.97% and once for 99.90%.]
4 The same qualitative effect is achieved if the securitising bank anticipates an average asset correlation of less than 8% (see footnote 1). In this case, too, less economic capital would need to be retained before securitisation.
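The break-even logic behind Fig. 3 can be sketched as follows; the 99.97% inputs are taken from Tables 3 and 4, whereas the released-capital figure used for the 99.90% case is purely an illustrative assumption of ours, chosen to reproduce the roughly 11% break-even mentioned above.

```python
# Hedged sketch of the break-even return on economic capital for the optimal
# strategy: profits = delta_EL*5 + released_capital*r*5 - fixed costs, solved
# for zero profits. The 99.97% inputs come from Tables 3/4; the 99.90% value
# below is an assumption for illustration, not a figure from the article.
YEARS = 5
COSTS = 1_298_463 + 2_796_938                 # transaction costs + mezzanine spreads
DELTA_EL_PER_YEAR = 9_114_585 - 9_101_805     # Table 3

def break_even_return(released_capital):
    return (COSTS - DELTA_EL_PER_YEAR * YEARS) / (released_capital * YEARS)

print(break_even_return(9_486_845))   # 99.97% solvency level -> roughly 8.5%
print(break_even_return(7_330_000))   # assumed release at 99.90% -> roughly 11%
```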
6 Conclusion
The analysis has shown that a credit institution which specialises in SMEs can significantly affect both its risk situation and its profits situation by using the instrument of portfolio securitisation. In the model presented above, a securitisation strategy in which only the mezzanine part of a portfolio, which contains most of the economic risk, is securitised proves to be particularly effective. The net profits from this strategy depend, inter alia, on the target credit rating of the securitising institution, which affects the absolute amount of the economic capitalisation: the better the target credit rating, the higher the capitalisation before securitisation and the higher the amount of economic capital released. In addition to the target credit rating, the required return on economic capital is a further important driver of the profitability of a transaction of this kind. The required return on economic capital determines the opportunity costs of capital utilisation; together with the absolute amount of economic capital released, it thus determines the profits from the optimal securitisation strategy. The fact that the economic success of the securitisation strategy depends on the target credit rating and on the return on economic capital implies, in particular, that institutions which set themselves a high solvency level and which require a substantial return on economic capital can use the instrument of loan securitisation to optimise profits.
Recent Advances in Credit Risk Management
Frances Cowell1, Borjana Racheva2, and Stefan Trück3
1 Morley Fund Management, London, England, [email protected]
2 FinAnalytica Inc., Sofia, Bulgaria, [email protected]
3 School of Economics and Finance, Queensland University of Technology, Australia, [email protected]
1 Introduction
In the last decade, the market for credit related products as well as techniques for credit risk management have undergone several changes. Financial crises and a high number of defaults during the late 1990s have stimulated not only public interest in credit risk management but also awareness of its importance in today's investment environment. The market for credit derivatives has also exhibited impressive growth rates. Active trading of credit derivatives only started in the mid 1990s, but it has since become one of the most dynamic financial markets. The dynamic expansion of the market requires new techniques and advances in credit derivative modelling and especially in the modelling of dependence among the drivers of credit risk. Finally, the upcoming new capital accord (Basel II) encourages banks to base their capital requirements for credit risk on internal or external rating systems [4]. This regulatory framework under the Bank for International Settlements (BIS), becoming effective in 2007, aims to strengthen the risk management systems of international financial institutions. As a result, the majority of internationally operating banks focuses on an internal-rating-based approach to determine capital requirements for their loan or bond portfolios. Another consequence is that, due to the new regulatory requirements, there is an increasing demand by holders of securitisable assets to sell or transfer the risks of their assets. Recent research suggests that while a variety of advances have been made, there are still several fallacies both in banks' internal credit risk management systems and in industry-wide used solutions. As [15] point out, the use of the normal distribution for modelling the returns of assets or risk factors is not adequate, since these returns generally exhibit heavy tails, excess kurtosis and skewness. All these features cannot be captured by the normal distribution. Also, the notion of correlation as the only measure of dependence between risk factors or asset returns has recently been examined in empirical studies, for example [7]. Using the wrong dependence structure may lead to a severe underestimation of the risk of a credit portfolio. The concept of copulas [13], allowing for more
diversity in the dependence structure between defaults as well as between the drivers of credit risk, could be a cure for these deficiencies. Further, we suggest alternatives to Value-at-Risk, which is often proposed as the only risk measure to be considered. We also relate these considerations to the idea of a coherent measure of risk as introduced by [3]; the article thus extends the framework of risk management to the expected tail loss (ETL) and advocates the informational effectiveness of this statistic. Finally, the quite dramatic effects of the business cycle on credit migration behaviour have been investigated more thoroughly in recent years [2, 22]. Alternative and more adequate models suggest the use of conditional instead of average historical migration matrices for determining credit VaR. The rest of the paper is set up as follows. Section 2 provides insight into sound modelling of the returns of risk factors and assets using alternatives to the Gaussian distribution. Section 3 focuses on dependence modelling with the concept of copulas. Section 4 extends the framework of risk management to the expected tail loss (ETL). Section 5 illustrates the necessity of using conditional migration matrices instead of average historical ones. Section 6 describes how these features can be integrated into a credit risk management system, and Section 7 concludes.
2 Adequate Modelling of Market and Risk Factors
In this section, we discuss how to generate scenarios for the asset returns of the obligors or for changes in the market risk factors. The dynamics of financial risk factors are well known to often exhibit some of the following phenomena: heavy tails, skewness and high-kurtotic residuals. The recognition and description of these phenomena go back to the seminal papers of [11] and [8]. To capture these features, we will introduce the α-stable distribution as an extension of the normal distribution. Due to its summation stability and the fact that it generalizes the Gaussian distribution, the class of stable distributions seems to be an ideal candidate to describe the return distribution of the considered risk factors. For an extensive description of the stable distribution and its application in financial theory see [17] or [15]. Let us first briefly review some of the main features of the stable distribution as the natural extension of the Gaussian distribution. An α-stable distributed random variable can be defined in the following way [17]:

Definition 1. A random variable X is said to have a stable distribution if the shape of its distribution is preserved, apart from scale and shift, under the addition of independent copies of X. The following definition states this condition formally and fully characterizes a random variable with stable distribution.

Definition 2. A random variable X follows a stable distribution if for any positive numbers A and B there exist a positive number C and a real number D such that

$$A X_1 + B X_2 \stackrel{d}{=} C X + D, \qquad (1)$$
where $X_1$ and $X_2$ are independent copies of X and "$\stackrel{d}{=}$" denotes equality in distribution. The stable distribution can also be defined by its characteristic function:

Definition 3. A random variable X has a stable distribution if there are parameters 0 < α ≤ 2, σ ≥ 0, −1 ≤ β ≤ 1, and µ ∈ R such that its characteristic function has the following form:

$$E\big(e^{iXt}\big)=\begin{cases}\exp\!\big(-\sigma^{\alpha}|t|^{\alpha}\,[\,1-i\beta\,\mathrm{sign}(t)\tan\tfrac{\pi\alpha}{2}\,]+i\mu t\big), & \text{if } \alpha\neq 1,\\ \exp\!\big(-\sigma|t|\,[\,1+i\beta\,\tfrac{2}{\pi}\,\mathrm{sign}(t)\ln|t|\,]+i\mu t\big), & \text{if } \alpha=1.\end{cases}\qquad (2)$$

The family of stable distributions contains the Gaussian (normal) distribution as a special case. However, non-Gaussian stable models do not possess the limitations of the normal one, and they all share a feature that differentiates them from the Gaussian distribution: heavy probability tails. Thus they can model a greater variety of empirical distributions, including skewed ones. The dependence of a stable random variable X on its parameters we indicate by writing X ∼ Sα(β, σ, µ). The parameters α, β, σ and µ of a stable Paretian distribution describe the stability, skewness, scale and drift, and satisfy the following constraints:

α is the index of stability (0 < α ≤ 2): for values of α lower than 2 the distribution becomes more leptokurtic in comparison to the normal distribution, meaning that the peak of the density becomes higher and the tails heavier. When α > 1, the location parameter µ is the mean of the distribution.

β is the skewness parameter (−1 ≤ β ≤ 1): a stable distribution with β = µ = 0 is called a symmetric α-stable distribution (SαS). If β < 0, the distribution is skewed to the left; if β > 0, it is skewed to the right. Hence the stable distribution can also capture asymmetric asset returns.

σ is the scale parameter (σ ≥ 0): the scale parameter allows us to write any stable random variable X as X = σX_0, where X_0 has a unit scale parameter and the same α and β as X.

µ is the drift (µ ∈ R): note that for 1 < α ≤ 2 the shift parameter µ equals the mean.

Obviously the stable distribution offers more parameters for modelling empirically observed risk factors than, e.g., the normal distribution. The word "stable" is used because the shape is preserved (apart from scale and shift) under addition, as in Equation (1). A very important advantage is that stable distributions form a family that contains the normal distribution as a special case; it is recovered for α = 2 and β = 0. Thus, most of the beneficial properties of the normal distribution which make it so popular within financial theory are also valid for the stable distributions:
• The sum of independent identically distributed (i.i.d.) stable random variables is again stable. This property allows us to build portfolios, for example.
• Stable distributions are the only distributional family that has its own domain of attraction, that is, a large sum of i.i.d. random variables will have a distribution that converges to a stable one. This is a unique feature, which means that if a given stock price or rate is the result of many small shocks, then the limiting distribution of the stock price can only be stable (that is, Gaussian or non-Gaussian stable).
It is a widely accepted critique of the normal distribution that it fails to capture certain properties of financial variables, namely fat tails and excess kurtosis. The stable distributions therefore provide much more realistic models for financial variables, which can capture the kurtosis and the heavy-tailed nature of financial data, see e.g. [15]. Figure 1 illustrates the superior density fit of a stable (non-Gaussian) distribution in comparison to a Gaussian (normal) one to the empirical distribution of the 1-week EURIBOR rate. Based on the superior fit to empirical data and the possibility to capture skewness, heavy tails and high-kurtotic residuals, the stable distribution has some advantages over the Gaussian model. It therefore seems favourable to assume that the probability model for asset returns and risk factors in credit risk modelling is described by the family of stable laws.
[Fig. 1. Density fit of a Gaussian (normal) and a stable (non-Gaussian) distribution to the empirical (sample) distribution of the 1-week EURIBOR rate; the plot compares a kernel estimate of the empirical density with the stable and the Gaussian fit.]
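The heavy-tail effect can be made tangible with a few lines of simulation. The sketch below is our illustration, not the chapter's EURIBOR fit; the α-stable parameters are assumed purely for demonstration.

```python
import numpy as np
from scipy import stats

# Our illustration, not the chapter's EURIBOR fit: simulate heavy-tailed
# "returns" from an alpha-stable law (parameters assumed for demonstration)
# and compare how often 5-sigma moves occur in the stable sample with what a
# Gaussian fitted to the same sample mean and standard deviation would predict.
alpha, beta, loc, scale = 1.7, 0.0, 0.0, 0.01
r = stats.levy_stable.rvs(alpha, beta, loc=loc, scale=scale, size=100_000,
                          random_state=42)

mu, sigma = r.mean(), r.std()
empirical_tail = np.mean(np.abs(r - mu) > 5 * sigma)
gaussian_tail = 2 * stats.norm.sf(5)            # P(|Z| > 5) under normality

print(f"stable sample: {empirical_tail:.3%} of observations beyond 5 sigma")
print(f"a Gaussian model would predict about {gaussian_tail:.1e}")
```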
X2 is identical in both models, as are their marginal distributions: X1 and X2 are normally distributed. Yet it is clear that the dependence structure of the two models is qualitatively different. If we interpret the random variables as financial losses, then adopting the first model could lead to an underestimation of the probability of extreme losses occurring jointly; on the contrary, according to the second model extreme losses have a stronger tendency to occur together. The example motivates the idea to model the dependence structure with a method more general than the correlation approach. Correlation is a widespread concept in modern finance and insurance and stands for a measure of dependence between two random variables. However, the term is very often incorrectly used to mean any notion of dependence. Actually, correlation is one particular measure of dependence among many. Of course, in the world of the multivariate normal distribution and, more generally, in the world of spherical and elliptical distributions, it is the accepted measure. Yet empirical research shows that real data seldom seem to have been generated from a distribution belonging to this class. There are at least three major drawbacks of the correlation approach. Consider the case of two real-valued random variables X and Y:

• The variances of X and Y must be finite or the correlation is not defined. This assumption causes problems when working with heavy-tailed data. For instance, the variances of the components of a bivariate t(n)-distributed random vector for n ≤ 2 are infinite, hence the correlation between them is not defined.
• Independence of two random variables implies correlation equal to zero; the opposite, generally speaking, is not correct: zero correlation does not imply independence. A simple example is the following: let X ∼ N(0, 1) and Y = X². Since the third moment of the standard normal distribution is zero, the correlation between X and Y is zero despite the fact that Y is a function of X, which means that they are dependent. Indeed, in the case of a multivariate normal distribution, uncorrelatedness and independence are interchangeable notions. This statement is, however, not valid if only the marginal distributions are normal and the joint distribution is non-normal; the example in Fig. 2 illustrates this fact.
• The correlation is not invariant under non-linear strictly increasing transformations T : R → R. This is a serious disadvantage, since in general corr(T(X), T(Y)) ≠ corr(X, Y).

A more general approach is to model dependency using copulas [13]. Let us consider a real-valued random vector X = (X_1, ..., X_n)^t. The dependence structure of the random vector is completely determined by the joint distribution function

$$F(x_1, \ldots, x_n) = P(X_1 \le x_1, \ldots, X_n \le x_n). \qquad (3)$$
It is possible to transform the distribution function so as to obtain a new function which completely describes the dependence between the components of the random vector and does not depend on the marginal distributions. This function is called a copula. Suppose we transform the random vector X = (X_1, ..., X_n)^t component-wise to have standard uniform marginal distributions U(0, 1). Each random variable X_i has a marginal distribution function F_i that is assumed to be continuous for simplicity. Recall that transforming a continuous random variable X with its own distribution function F results in a random variable F(X) which is uniformly distributed on [0, 1]. Thus, transforming equation (3) component-wise yields

$$F(x_1, \ldots, x_n) = P(X_1 \le x_1, \ldots, X_n \le x_n) = P[F_1(X_1) \le F_1(x_1), \ldots, F_n(X_n) \le F_n(x_n)] = C(F_1(x_1), \ldots, F_n(x_n)), \qquad (4)$$
where the function C can be identified as a joint distribution function with standard uniform marginals: the copula of the random vector X. Equation (4) clearly shows how the copula combines the marginals into the joint distribution. Sklar's theorem provides a theoretical foundation for the copula concept [18]:

Theorem 4. Let F be a joint distribution function with continuous margins F_1, ..., F_n. Then there exists a unique copula C : [0, 1]^n → [0, 1] such that (4) holds for all x_1, ..., x_n in R̄ = [−∞, ∞]. Conversely, if C is a copula and F_1, ..., F_n are distribution functions, then the function F given by (4) is a joint distribution function with margins F_1, ..., F_n.

For the case that the marginals F_i are not all continuous, it can be shown [18] that the joint distribution function can still be expressed as in equation (4). However, the copula C is no longer unique in this case. For risk management, the use of copulas offers the following advantages:

• The nature of dependency that can be modelled is more general; in comparison, only linear dependence can be explained by the correlation.
• Dependence of extreme events might be modelled.
• Copulas are invariant under continuously increasing transformations (not only linear ones, as is the case for correlations): if (X_1, ..., X_n)^t has copula C and T_1, ..., T_n are increasing continuous functions, then (T_1(X_1), ..., T_n(X_n))^t also has the copula C.
The last statement may be quite important in asset-value models for credit risk, because this property postulates that the asset values of two companies shall have exactly the same copula as the stock prices of these two companies. The latter is true if we consider the stock price of a company as a call option on its assets and if the option pricing function giving the stock price is continuously increasing with respect to the asset values.
Overall, we conclude that the use of copulas as a more general measure of dependence has several advantages over the use of correlations only. Since especially in credit risk a nonlinear dependence structure between different risk factors, asset values and credit events may be assumed, the concept should be included in an adequate risk management approach.
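To illustrate the mechanics behind equation (4) and the invariance property, the following sketch (our own, with assumed parameters) draws dependent uniforms from a Gaussian copula, attaches two different marginals to them, and checks that a rank-based dependence measure is unaffected by increasing transformations of the components.

```python
import numpy as np
from scipy import stats

# Our sketch of the copula mechanics, with assumed parameters: draw dependent
# uniforms from a Gaussian copula and attach two different marginals; the
# copula (dependence) is untouched by the choice of marginals and by strictly
# increasing transformations of the components.
rng = np.random.default_rng(0)
rho = 0.6
z = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=50_000)
u = stats.norm.cdf(z)                         # component-wise F(X): uniform marginals

x1 = stats.norm.ppf(u[:, 0])                  # marginal 1: standard normal
x2 = stats.t.ppf(u[:, 1], df=3)               # marginal 2: Student-t(3), heavy tails

# Rank correlation depends only on the copula, so it is unchanged when the
# components are transformed by increasing functions.
print(stats.spearmanr(x1, x2)[0])
print(stats.spearmanr(np.exp(x1), x2**3)[0])  # same Spearman rho
```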
4 Alternative Risk Measures
Once the portfolio value scenarios are generated, an estimate for the distribution of the portfolio values can be obtained. We may then choose to report any number of descriptive statistics for this distribution. For example, mean and standard deviation could be obtained from the simulated portfolio values using sample statistics. However, because of the skewed nature of the portfolio distribution, the mean and standard deviation may not be good measures of risk. Since the distribution of values is not normal, it is not appropriate to infer percentile levels from the standard deviation. Given the simulated portfolio values, we can compute better measures, for example empirical quantiles (VaR at different confidence levels) or expected shortfall (ES) and expected tail loss (ETL) risk statistics. The VaR framework, though well established in the industry, has been subject to various criticism. In their seminal paper, [3] point out that the VaR concept has to be regarded with care and should not be the only concept for risk evaluation. Firstly, VaR creates severe aggregation problems and does not behave nicely with respect to the addition of risks, even if the risks are independent. Further, the use of value-at-risk does not consider diversification effects adequately. Hence, alternative risk measures should be considered when it comes to the evaluation of portfolio credit risk. A more adequate measure of risk is the conditional value-at-risk (CVaR), also known as tail value-at-risk, expected shortfall or expected tail loss (ETL). It is defined as

$$ETL_{\alpha}(X) = \frac{1}{\alpha}\int_0^{\alpha} VaR_q(X)\,dq, \qquad (5)$$

where $VaR_q(X) = -\inf_x\{x \mid P(X \le x) \ge q\}$ is the VaR of the random variable X, interpreted as a financial asset return, so that −X is the loss. For a continuous distribution of X this is equivalent to $ETL_{\alpha}(X) = -E(X \mid X \le -VaR_{\alpha}(X))$. ETL is thus defined as a conditional loss, i.e. the average of the losses provided these are larger than the predicted VaR threshold at a given confidence level. Compared to VaR, which is a point estimate of risk, ETL reflects all information contained in the left tail of the asset return probability distribution. This fact makes ETL a much more reliable and informationally effective risk statistic. Managing risk and/or optimizing portfolios on
the basis of ETL leads to higher risk-adjusted returns. Compared to VaR, ETL possesses a number of advantages; among others, ETL is a smooth function which can be readily optimized. Moreover, ETL reflects only the downside and does not penalize the upside potential of the portfolio or asset returns, which is not true for the standard deviation. Recently, a variety of alternative risk measures has been introduced in the literature. One may also like to consider individual assets and to ascertain how much risk each asset contributes to the portfolio; hence, marginal (incremental) risk statistics should also be considered. For an overview of desirable properties of a risk measure see for example [3], [19], [20] or [16].
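As a concrete illustration of the difference between the two statistics, the following sketch (our own, with purely illustrative scenario data) computes empirical VaR and ETL from simulated portfolio returns using the sign conventions of equation (5).

```python
import numpy as np

# Hedged sketch: empirical VaR and ETL at tail probability alpha from simulated
# portfolio returns. Sign conventions follow equation (5): X is a return, so
# losses are -X, and both measures are reported as positive loss numbers.
def var_etl(returns, alpha=0.01):
    returns = np.asarray(returns)
    var = -np.quantile(returns, alpha)       # loss at the alpha-quantile of returns
    tail = returns[returns <= -var]          # scenarios at least as bad as the VaR
    etl = -tail.mean()                       # average loss beyond the VaR threshold
    return var, etl

# Toy usage with skewed, heavy-tailed scenario returns (purely illustrative).
rng = np.random.default_rng(7)
scenarios = 0.002 + 0.01 * rng.standard_t(3, size=100_000)
scenarios -= 0.03 * (rng.random(100_000) < 0.005)   # rare jump-to-default losses
print(var_etl(scenarios, alpha=0.01))
```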
5 Conditional Migration Behaviour
It is generally agreed that assigned ratings and the corresponding default probabilities, but also the probabilities of rating changes, are important determinants of a bank's credit risk management. Unfortunately, due to the cyclical behaviour of the economy, credit spreads and migrations are not constant through time. [9] as well as [1] have shown that default rates and credit spreads clearly depend on the stage of the business cycle. [12] provided insight that probability transition matrices of bond ratings also vary with the state of the economy. Further investigating the issue, [22] show that such changes in migration or default behaviour through time lead to substantial effects on risk figures for credit portfolios. Thus, measuring and forecasting changes in migration behaviour, as well as determining adequate estimators for transition matrices, can be considered a major issue in rating-based credit risk modelling. Still, despite the obvious importance of recognizing the impact of business cycles on rating transitions, the literature is sparse on this issue. The first model developed to explicitly link business cycles to rating transitions was CreditPortfolioView (CPV), introduced in 1997 by [25] and McKinsey and Company. [6] as well as [10] use a one-factor model whereby ratings respond to business cycle shifts. The model is extended to a multi-factor credit migration model by [23]. Finally, [12] propose an ordered probit model which permits migration matrices to be conditioned on the industry, the country of domicile and the business cycle. In this section we summarize the main ideas of two of the approaches for adjusting migration matrices to the business cycle: the CreditPortfolioView model (CPV) by [25] and the factor models initially suggested by [6] and [10]. In the macro simulation approach by [25], a time series model for the business cycle is used to determine a conditional migration matrix. Let Y_{j,t} be the macro-economic index for rating class j at time t. Then Y_{j,t} is derived from a multi-factor time-series model of the form

$$Y_{j,t} = \beta_{j,0} + \beta_{j,1} X_{1,t} + \beta_{j,2} X_{2,t} + \cdots + \beta_{j,m} X_{m,t} + v_{j,t}. \qquad (6)$$
According to the model, the index Y_{j,t} depends on economic variables X_k, k = 1, ..., m, where v_{j,t} represents an error term. The error term v_{j,t} is interpreted as the index innovation and is assumed to be independent of the X_{k,t} and identically normally distributed, v_{j,t} ∼ N(0, σ_j) for every t, and independent across j; we write v_j ∼ N(0, Σ_v). The macroeconomic factors X_k are assumed to follow an auto-regressive process of order 2, AR(2):

$$X_{k,t} = \gamma_{k,0} + \gamma_{k,1} X_{k,t-1} + \gamma_{k,2} X_{k,t-2} + e_{k,t}. \qquad (7)$$
Here, X_{k,t-1} and X_{k,t-2} denote the lagged values of the variable X_k, while e_{k,t} denotes an error term that is assumed to be i.i.d., e_{k,t} ∼ N(0, σ_e). Based on parameter estimates for equations (6)-(7), the macroeconomic index can also be estimated for future periods. This index can be used to determine conditional default probabilities p_{j,t} for rating class j in period t. The author suggests a logit model of the form

$$p_{j,t} = \frac{1}{1 + e^{-Y_{j,t}}}, \qquad (8)$$
while other models could be applied as well. Finally, for the estimation of the conditional migration matrix a shifting procedure is used that redistributes the probability mass within each row of the unconditional migration matrix [24]. The shift operator is written in terms of a matrix S = {S_{ij}}, and the shift procedure is accomplished by

$$P_{cond} = (I + \tau S)\,P_{uncond}, \qquad (9)$$
where τ denotes the amplitude of the shift in segment j and is a function of the estimated conditional default probability. For further conditions imposed on the factor τ we refer to [24].

Alternative models for the adjustment of migration matrices to business cycle variables are approaches based on factor models including a systematic and an idiosyncratic risk component [6, 10, 23]. In these approaches, a one-factor model is adopted to incorporate credit cycle dynamics into the transition matrix. First a so-called credit cycle index Zt, defining the credit state based on macroeconomic conditions shared by all obligors during period t, is estimated. The index is designed to be positive in good times and negative in bad times. A positive index implies a lower probability of default (PD) and downgrading probability but a higher upgrading probability, and vice versa. To calibrate the index, PDs of speculative grade bonds are used, since the PDs of higher rated bonds are often rather insensitive to the economic state, see e.g. [5]. Instead of the logit model suggested in [24], here a probit model is used. Further, it is assumed that rating transitions reflect an underlying, continuous credit-change indicator Y following a standard normal distribution.
This credit-change indicator is assumed to be influenced by both a systematic and an unsystematic risk component. Therefore, Yt has a linear relationship with the systematic credit cycle index Zt and an idiosyncratic error term εt. Thus, the typical one-factor model parametrisation is obtained for the credit-change indicator:

Yt = ρZt + √(1 − ρ²) εt.     (10)

Since both Zt and εt are scaled to the standard normal distribution, with the weights chosen to be ρ and √(1 − ρ²), Yt is also standard normal. Note that ρ represents the correlation between the credit-change indicator Yt and the systematic credit cycle index Zt. The distribution of rating changes for a company then depends on the outcome of the systematic risk index. To apply this scheme to a multi-rating system, it is assumed that, conditional on an initial credit rating i at the beginning of a year, one partitions the values of the credit-change indicator Y into a set of disjoint bins. The bins are defined in such a way that the probability of Yt falling in a given interval equals the corresponding historical average transition rate. This can be done simply by inverting the cumulative normal distribution function starting from the default column, as illustrated in Fig. 3. Using the bins calculated from the average transition matrix it is then straightforward to calculate the transition probabilities conditional on the credit cycle index. On average days one obtains Zt = 0 for the systematic risk index, and the credit-change indicator Yt follows a standard normal distribution. A positive outcome of the credit cycle index Zt shifts the credit-change indicator to the right-hand side, while in the case of a bad outcome of the systematic credit cycle index the distribution moves to the left-hand side. Thus, in any year the observed transition rates will deviate from the average migration matrix, and we have to find a shift such that the probabilities
Fig. 3. Corresponding credit scores to transition probabilities for a company with BBB rating (compare [6])
associated with the bins defined above best approximate the given year's observed transition rates. The estimation problem then results in determining ρ such that the distance between the forecasted conditional transition matrix and the empirically observed migrations is minimized, see e.g. [10]. Note that [23] extends the one-factor representation to a multi-factor, Markov chain model for rating migrations and credit spreads. The different approaches point out the importance of incorporating business cycle effects into the estimation of credit migration matrices. [10], [12] and more recently [21] show that the conditional approach outperforms a naive approach of simply taking historical average or previous year's transition matrices.
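To make the binning-and-shifting mechanics of the one-factor approach concrete, the following is a minimal Python sketch under stated assumptions: the average transition row is purely illustrative (not taken from the text), rating categories are ordered from best to default, and the conditional indicator follows Yt | Zt ∼ N(ρZt, 1 − ρ²) as in (10).

```python
import numpy as np
from scipy.stats import norm

def conditional_row(avg_row, rho, z):
    """One-factor conditional migration probabilities for a single rating row.

    avg_row : historical average transition probabilities, ordered from the
              best rating category to default (must sum to 1).
    rho     : correlation between the credit-change indicator and the index.
    z       : realisation of the credit cycle index Z_t (0 on "average" days).
    """
    # Bin thresholds: invert the standard normal CDF of the cumulative
    # probabilities, starting from the default column.
    cum = np.concatenate(([0.0], np.cumsum(avg_row[::-1])))   # from default upward
    thresholds = norm.ppf(np.clip(cum, 0.0, 1.0))             # -inf ... +inf
    # Conditional on Z_t = z, the indicator Y_t is N(rho*z, 1 - rho^2).
    scale = np.sqrt(1.0 - rho ** 2)
    cond_cum = norm.cdf((thresholds - rho * z) / scale)
    return np.diff(cond_cum)[::-1]                             # back to original order

# Purely illustrative numbers (not from the paper): an "average" BBB row.
avg_bbb = np.array([0.0002, 0.0030, 0.0595, 0.8693, 0.0530, 0.0117, 0.0012, 0.0021])
print(conditional_row(avg_bbb, rho=0.25, z=-1.0))  # bad year: more downgrades/defaults
```

For a negative (bad-year) outcome of the index the probability mass of the row shifts toward the downgrade and default columns, mirroring the mechanism illustrated in Fig. 3.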
6 Integration of the Advanced Technologies in a Credit Risk Management System

This section outlines the steps of the application of integrated credit and market risk management in practice. The presentation follows the implementation of these steps in the Cognity software system. The system incorporates two models for credit risk measurement, the Asset Value Model (AVM), which is an extension of CreditMetrics, and the Stochastic Default Rate (SDR) model, which serves as an enhancement to McKinsey's CreditPortfolioView, and integrates these in a general risk management framework in which the credit quality of the obligors is modelled as dependent on movements in market risk drivers as well. Most software systems on the market offer either market or credit risk measurement/management in separate products, and it is a well-known fact that their users consistently experience difficulties when trying to merge the results and build a comprehensive risk picture of the portfolio.

6.1 The Asset-Value Approach in Cognity Credit Risk

There are four key steps in the Monte Carlo approach to credit risk modelling in the asset value model:

Step 1. Modelling the dependence structure between market risk factors and the credit risk drivers.

Step 2. Scenario generation - each scenario corresponds to a possible "state of the world" at the end of the risk horizon for which the portfolio risk is estimated. For the purposes of this article, the "state of the world" is just the credit rating of each of the obligors in the portfolio and the corresponding values of the market risk factors affecting the portfolio.

Step 3. Portfolio valuation - for each scenario, the software evaluates the portfolio to reflect the new credit ratings and the values of the market risk factors. This step creates a large number of possible future portfolio values.
Step 4. Summarize results - having the scenarios generated in the previous steps, an estimate for the distribution of the portfolio value is produced. The user may then choose to report any number of descriptive statistics for this distribution.

The general methodology described below is valid for every Monte Carlo approach to credit risk modelling in the AVM. We will now describe the improvements that have been introduced to the first two components of this class of models.

Modelling the Dependence Structure Between the Market Risk Factors and the Credit Risk Drivers

Under the asset value models, the general assumption is that the driver of credit events is the asset value of a company. The dependence structure between the asset values of two firms can be approximated by the dependence structure between the stock prices of those firms. In case there is no stock price information for a given obligor, we employ the idea of segmentation described in CreditMetrics. The essence of this approach is that the user determines the percentage allocation of obligor volatility among the volatilities of certain market indices, and explains the dependence between obligors by the dependence of the market indices that drive the obligors' volatilities. As discussed in Sect. 3, modelling the dependence structure requires greater flexibility than that offered by the correlation concept. Hence, the Cognity Credit Risk Module supplies flexible dependence structure models:

• A copula approach
• A subordinated model approach
• A simplified approach using correlations as a measure of dependency, for comparison purposes
A copula suitable for modelling dependencies between financial variables, and credit drivers in particular, should be flexible enough to capture the dependence of extreme events and also asymmetries in dependence. There are few flexible multivariate copula functions which can be applied to large-dimensional problems. Examples include the Gaussian copula (the one behind the multivariate Gaussian distribution), the multivariate Student's t-copula, etc. Cognity utilizes a flexible copula model which contains the multivariate Student's t-copula as a special case and allows for asymmetry in the dependence model as well as for dependence in the extreme events. The copula model is based on an asymmetric version of the multivariate Student's t-distribution and is flexible enough for all market conditions, including severe crises in which the asymmetric dependence is most pronounced.

The subordinated approach arises from the so-called subordinated distributions. The symmetric stable distribution discussed in Sect. 2 is one representative of this class. In particular, a random variable X is said to be subordinated if its distribution allows the following stochastic representation:
X = Y · Z, where Y is a positive random variable called the subordinator, Z has a normal distribution and Y is independent of Z. In case X is a vector, Y and Z are vectors as well and the multiplication is defined element-by-element. The subordinated models constitute a rich and flexible class containing all random volatility models, and can be extended to include skewed representatives. The concept of dependence within the subordinated models is introduced through the Gaussian component Z and through the dependence between the components of the subordinators. This dependence model can be interpreted in the following way: the central part of the distribution is dominated by the Gaussian component and, therefore, is described by the covariance structure. The extreme events are triggered by the subordinators and, as a result, their dependence or independence is a consequence of the dependence or independence of the components of the vector of subordinators. The Cognity system distinguishes between two categories - dependent subordinators and independent subordinators. Special cases of the dependent subordinators model are: the multivariate Student's t-distribution, when the estimated degrees of freedom for all financial variables are the same; the sub-Gaussian stable distribution, when all estimated indices of stability are the same; and, of course, the multivariate Gaussian distribution, which appears as a special case in both the dependent and independent subordinators cases.

Scenario Generation

In this section, we discuss how to simultaneously generate scenarios for the future credit ratings of the obligors in the portfolio and for the changes in the market risk factor values. Each set of future credit ratings and market risk factor values corresponds to a possible 'state of the world' at the end of our risk horizon. The scenario generation procedure under the asset-value model is as follows:

1. Establish asset-return thresholds for the obligors in the portfolio. The thresholds define migration from one credit rating to another.
2. Generate scenarios for asset returns and market risk factor values using an appropriate distribution - this is an assumption to be imposed.
3. Map the asset-return scenarios to credit rating scenarios.

As discussed in Sect. 2, a heavy-tailed model is needed to properly describe the behaviour of asset returns. Utilizing extended subordinated models, the Cognity framework allows for selecting among several heavy-tailed distributional models: (1) the stable distributions discussed in Sect. 2; (2) the Student's t-distribution - in fact a location- and scale-enhanced version of the Student's t-distribution, a symmetric heavy-tailed distribution which allows for a subordinated representation and for which the normal distribution appears asymptotically as the 'degrees of freedom' parameter increases indefinitely; (3) the asymmetric Student's t-distribution. There are many ways to arrive
at an asymmetric version of the traditional Student's t-distribution. We have selected an asymmetric version which allows for a representation following an extension of the classical subordinated model of the form

X = µ + γY + g(Y)Z     (11)

where µ and γ are constants and g : R+ → R+ is a function. Note that other classes of distributions which fit in the selected framework, like the generalized hyperbolic distribution, could easily be included in the framework. For further information we refer to [14] or [15]. If dependency is modelled using copulas, then the marginal distributions can also follow any of the univariate forms of the distributions described above.

Both the subordinated and the copula-based Cognity models allow for relatively easy generation of random samples. Once the scenarios for the asset values are generated, one only needs to assign credit ratings for each scenario. This is done by comparing the asset value in each scenario to the rating thresholds. Rating thresholds are estimated based on a migration matrix. Note that some of the conditional migration probability approaches discussed in Sect. 5 can be embedded in this model.

Evaluation on the Portfolio Level and Summarizing the Results

For non-default scenarios, the portfolio valuation step consists of applying a valuation model for each particular position within the portfolio over each scenario. The yield curve corresponding to the credit rating of the obligor for this particular scenario should be used. For default scenarios, a model for the recovery rates is required. As discussed in many empirical analyses, recovery rates are not deterministic quantities but rather exhibit large variations. Such variation of value in the case of default is a significant contributor to risk. Recovery rates can be modelled using the Beta distribution with a specified mean and standard deviation. In this case, for each default scenario for a given obligor, we should generate a random recovery rate for each particular transaction with the defaulted obligor, so that the value of a given position will differ depending on which default scenario is realized.

Having the portfolio value scenarios generated in the previous steps, we obtain an estimate for the distribution of the portfolio values. We may then choose to report any set of descriptive statistics for this distribution. The calculation of statistics is the same for both Cognity models. For example, the mean and standard deviation of the future portfolio value can be obtained from the simulated portfolio values using sample statistics. Because of the skewed nature of the portfolio distribution, the mean and the standard deviation may not be good measures of risk. Given the simulated portfolio values, we can compute better measures, for example empirical quantiles, or the Expected Tail Loss discussed in Sect. 4.
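As an illustration of the threshold mapping and the Beta recovery-rate model just described, here is a minimal sketch. The transition row and the recovery mean and standard deviation are hypothetical, and for brevity both the asset-value scenarios and the thresholds use the standard normal distribution; with a stable or Student's t model the same inversion would simply use that distribution's CDF.

```python
import numpy as np
from scipy.stats import norm, beta

rng = np.random.default_rng(0)

# Illustrative average one-year transition row for one obligor (best -> default);
# the numbers are hypothetical, not taken from the text.
row = np.array([0.0002, 0.0030, 0.0595, 0.8693, 0.0530, 0.0117, 0.0012, 0.0021])

# Rating thresholds: invert the normal CDF of the cumulative probabilities,
# starting from the default column (cf. Sect. 5).
cum = np.concatenate(([0.0], np.cumsum(row[::-1])))
thresholds = norm.ppf(np.clip(cum, 0, 1))            # -inf ... +inf, default bin first

# Simulate standardized asset-return scenarios and map them to rating indices.
asset_returns = rng.standard_normal(100_000)          # stand-in for heavy-tailed scenarios
idx_from_default = np.searchsorted(thresholds, asset_returns, side="right") - 1
new_rating = len(row) - 1 - idx_from_default          # 0 = best rating, len(row)-1 = default

# Default scenarios: draw Beta-distributed recovery rates with a given mean/std.
mean_rr, std_rr = 0.40, 0.25                           # hypothetical recovery assumptions
k = mean_rr * (1 - mean_rr) / std_rr**2 - 1
a, b = mean_rr * k, (1 - mean_rr) * k
defaults = new_rating == len(row) - 1
recoveries = beta.rvs(a, b, size=defaults.sum(), random_state=rng)

print("simulated default rate:", defaults.mean(), "mean recovery:", recoveries.mean())
```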
6.2 The Stochastic Default Rate Approach in Cognity Credit Risk

Credit risk modelling based on the Stochastic Default Rate (SDR) approach comprises five key steps:

1. Build the econometric models for the default rates and for the explanatory variables. The default probability of a given segment is described by an econometric model using explanatory variables such as macro-factors, indices, etc. It is fit using historical data for default frequencies in a given segment and historical time series for the explanatory variables.
2. Generate scenarios. Each scenario corresponds to a possible 'state of the world' at the end of our risk horizon. Here, the 'state of the world' is a set of values for the market variables and for the explanatory variables defined in step 1.
3. Estimate default probabilities for the segments under each scenario based on the scenario values for the explanatory variables and the model estimated in step 1. Then the migration matrix is adjusted. Simulate sub-scenarios for the status of each obligor.
4. Portfolio valuation. For each scenario, re-evaluate the portfolio to reflect the new credit status of the obligor and the values of the market risk factors. This step generates a large number of possible future portfolio values.
5. Summarize results. Having the scenarios generated in the previous steps, we possess an estimate for the distribution of portfolio values. We may then choose to report any descriptive statistics for this distribution.

The last two parts are the same for the Asset Value Model and for the Stochastic Default Rate Model, so we will concentrate on the first three components of the model.

Building the Econometric Models

Two models should be defined and estimated under the SDR approach: the first model provides an econometric approach for the default probabilities of a segment based on explanatory variables like macro-factors, indices, etc.; the second model deals with a time-series approach for the explanatory variables.

Default probability models are evaluated for each user-defined segment. The segment definitions can be flexible, based on criteria like the credit rating, the industry, the region and the size of the company, provided that time series of default rates are available for each of the segments. The explanatory variables that might be appropriate to represent the systematic risk of the default rates in the chosen country/industry/segment depend on the nature of the portfolio and might comprise industry indices, macro variables (GDP, unemployment rate) as well as long-term interest rates or exchange rates, etc.

When defining the model for the default probability of a segment based on explanatory variables (macro-factors, indices, etc.) we use historical data
for default frequencies in a given segment and historical time series for the explanatory variables. The idea is similar to the CreditPortfolioView described in Sect. 5. Hereby, a function f is chosen and estimated such that DFs,t = f (X1,t , . . . , XN,t ) + ut
(12)
where DFs,t is the default frequency in segment s for time period t and Xi,t is the value of the i-th explanatory variable at time t, i = 1, . . . , N. It should be mentioned that, in general, explanatory variables can be observable factors but also factors estimated by means of fundamental factor analysis based on stock returns in a given segment, or latent variables coming from statistical factor models. The second model is a time-series model for the explanatory variables. The usual way to model these variables (as suggested also in CreditPortfolioView) is to employ some kind of ARMA(p, q) model. That is the same as assuming that

Xt = a0 + Σi=1..p ai Xt−i + Σj=1..q bj εt−j + εt,     (13)
where εt ∼ N(0, σ²). It is important to note that a sound modelling of the default rate will depend very much on a proper modelling of these variables. There are numerous empirical studies showing that the real distribution of the residuals deviates from the assumption of the model - the residuals are not normal. They are usually skewed, with fatter tails and volatility clustering. Thus the improper use of normal residuals may end up with 'incorrect' scenarios (simulations) for the possible default rates. For additional information, see e.g. [14] or [15]. For the modelling of macro-factors, the Cognity system proposes the following more general Vector-AR(1)-GARCH type model with heavy-tailed residuals. The model takes the following form: Xt = A1 Xt−1 + Et
(14)
where Xt = (X1,t, . . . , Xn,t) is the vector of explanatory variables, A1 is an n × n matrix and Et = (ε1,t, . . . , εn,t) is the vector of residuals, which are modelled by a multivariate heavy-tailed GARCH-type model.

The Monte Carlo Approach in the SDR Model

There are five key steps in the Monte Carlo approach to credit risk modelling based on stochastic modelling of the default rate:

Step 1. Build econometric models for default rates and explanatory risk variables. Based on the explanatory risk variables (macro-factors, indices, etc.)
an econometric model for the default probability for each segment is fitted using historical data for default probabilities in a given segment and historical time series data for the explanatory variables.

Step 2. Generate scenarios - each scenario corresponds to a possible "state of the world" at the end of the risk horizon. Here, the "state of the world" is a set of values for the market and explanatory risk variables defined.

Step 3. Estimate default probabilities under each scenario for each segment using the scenario (simulation) values of the explanatory variables and the model estimated in step 1. Sample a new default rate for each obligor and adjust the respective migration probabilities based on the new default rate. Determine the credit rating status of each obligor based on the new migration and default probabilities. Technically this is accomplished by making use of a uniform (0,1) random variable, which is drawn for each counterparty and each simulation of the default rate.

Step 4. Portfolio valuation - for each scenario, revalue the portfolio to reflect the new credit status of the obligor and the values of the market risk factors. This step generates a large number of possible future portfolio values.

Step 5. Summarize results - once the scenarios in the previous steps are generated, we come up with an estimate for the distribution of portfolio values. We may then choose to report any descriptive statistics for this distribution.
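A minimal sketch of steps 1-3 of the SDR Monte Carlo approach follows, under simplifying assumptions: the default-frequency history and the single macro factor are simulated (hypothetical) data, the default-rate model (12) is fitted on the logit scale by ordinary least squares rather than full maximum likelihood, and the factor follows an AR(1) with bootstrapped residuals instead of the heavy-tailed GARCH model of (14).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical history: quarterly default frequencies of one segment and one
# macro factor (e.g. GDP growth); the numbers are purely illustrative.
T = 40
x = rng.normal(0.5, 1.0, T)                              # macro factor X_t
df = 1 / (1 + np.exp(-(-4.0 - 0.6 * x))) + rng.normal(0, 0.002, T)
df = np.clip(df, 1e-4, 0.5)                               # observed default frequencies DF_t

# Step 1a: default-rate model (12), fitted on the logit scale by least squares.
y = np.log(df / (1 - df))
A = np.column_stack([np.ones(T), x])
beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)

# Step 1b: AR(1) model for the macro factor, again by least squares.
a1, a0 = np.polyfit(x[:-1], x[1:], 1)
resid = x[1:] - (a0 + a1 * x[:-1])

# Steps 2-3: simulate next-period factor scenarios and the implied default rates.
x_next = a0 + a1 * x[-1] + rng.choice(resid, size=10_000, replace=True)  # bootstrapped residuals
pd_next = 1 / (1 + np.exp(-(beta_hat[0] + beta_hat[1] * x_next)))

print("scenario default-rate quantiles:", np.quantile(pd_next, [0.05, 0.5, 0.95]))
```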
7 Conclusion

In this paper we reviewed recent advances in credit risk management. The upcoming new Basel capital accord, recent periods of high default rates and the substantial growth in the credit derivatives markets have led to a high awareness of necessary improvements in credit risk modelling. We provided an overview of the most common fallacies embedded in several industry-wide used solutions.

The first concept under criticism was the use of the normal distribution to model asset returns. Recent research by e.g. Rachev and Mittnik (2000) suggests that the use of the normal distribution for modelling the returns of an asset or macroeconomic risk factors is not adequate. The normal distribution cannot capture important features like heavy tails, excess kurtosis and skewness exhibited by the variables. We also reviewed the deficiencies of the use of correlation as a dependence measure between risk factors or asset returns (Embrechts et al, 2001). We argue that using the wrong dependence structure may lead to severe underestimation of the risk for a credit portfolio and recommend the use of copulas (Sklar, 1959) as an alternative concept. Copulas allow for more diversity in the dependence structure between defaults as well as the drivers of credit risk and should be incorporated in advanced credit risk management systems.

Further, we suggested the additional use of alternative risk measures next to the industry standard of Value-at-Risk. The idea of coherent risk measures, initially introduced by Artzner et al (1999), provides theorems for the construction
of more adequate measures. Hence, we propose to consider not only a single quantile of the loss distribution of a credit portfolio, but to include several risk measures such as expected shortfall (ES) and expected tail loss (ETL). Furthermore, as a result of the quite dramatic effects of the business cycle on credit migration behaviour, we point out the importance of using conditional instead of historical average migration matrices. In recent years, research and empirical studies (e.g. Allen and Saunders, 2003; Trück and Rachev, 2005) suggest that business cycle effects have a substantial impact on CVaR and should not be ignored. We propose different methods that can be used for the estimation of conditional transition matrices. Finally, a case study using the FinAnalytica Inc. Cognity software system provides some information on how the discussed features can be incorporated in an up-to-date credit risk management system. Hereby, two different classes of credit risk models are considered - an extension of the classic Asset Value Model (AVM) and an advanced Stochastic Default Rate (SDR) model.
Acknowledgement The authors are grateful to Georgi Mitov (FinAnalytica) and Dobrin Penchev (FinAnalytica) for the fruitful comments, suggestions and computational assistance. We also thank Zari Rachev (University of Karlsruhe, UCSB and FinAnalytica) and Stoyan Stoyanov (FinAnalytica) for helpful discussions.
References

[1] Alessandrini, F., 1999. Credit Risk, Interest Rate Risk, and the Business Cycle. Journal of Fixed Income 9 (2), 42–53.
[2] Allen, L., Saunders, A., 2003. A Survey of Cyclical Effects in Credit Risk Measurement Models. BIS Working Paper 126.
[3] Artzner, P., Delbaen, F., Eber, J.-M., Heath, D., 1999. Coherent Measures of Risk. Mathematical Finance 9 (3), 203–228.
[4] Basel Committee on Banking Supervision, 2001. The New Basel Capital Accord, Second Consultative Document.
[5] Belkin, B., Forest, L., Suchower, S., 1998. The Effect of Systematic Credit Risk on Loan Portfolio Value-at-Risk and Loan Pricing. CreditMetrics Monitor.
[6] Belkin, B., Forest, L., Suchower, S., 1998. A One-Parameter Representation of Credit Risk and Transition Matrices. CreditMetrics Monitor.
[7] Embrechts, P., McNeil, A., Straumann, D., 1999. Correlation and Dependence in Risk Management: Properties and Pitfalls. In: Risk Management: Value at Risk and Beyond, ed. Dempster, M.
[8] Fama, E., 1965. The Behaviour of Stock Market Prices. Journal of Business 38, 34–105.
[9] Helwege, J., Kleiman, P., 1997. Understanding Aggregate Default Rates of High-Yield Bonds. Journal of Fixed Income 7 (1), 55–61.
[10] Kim, J., 1999. Conditioning the Transition Matrix. Risk, Credit Risk Special Report, 37–40.
[11] Mandelbrot, B., 1963. The Variation of Certain Speculative Prices. Journal of Business 36, 394–419.
[12] Nickell, P., Perraudin, W., Varotto, S., 2000. Stability of Rating Transitions. Journal of Banking and Finance 1-2, 203–227.
[13] Picone, D., 1959. Fonctions de répartition à n dimensions et leurs marges. Working Paper, Cass Business School 8, 229–231.
[14] Rachev, S., Martin, R., Racheva, B., Stoyanov, S., 2006. Stable ETL Portfolios and Extreme Risk Management. Working Paper.
[15] Rachev, S., Mittnik, S., 2000. Stable Paretian Models in Finance. Wiley, New York.
[16] Rachev, S., Ortobelli, S., Stoyanov, S., Fabozzi, F., Biglova, A., 2006. Desirable Properties of an Ideal Risk Measure in Portfolio Theory. Working Paper.
[17] Samorodnitsky, G., Taqqu, M., 1994. Stable Non-Gaussian Random Processes. Chapman & Hall, New York.
[18] Schweizer, B., Sklar, A., 1983. Probabilistic Metric Spaces. North Holland Elsevier, New York.
[19] Szegö, G., 2002. Measures of Risk. Journal of Banking and Finance 26 (7), 1253–1272.
[20] Szegö, G., 2004. Risk Measures for the 21st Century. Wiley, Chichester.
[21] Trück, S., 2008. Forecasting Credit Migration Matrices with Business Cycle Effects - A Model Comparison. European Journal of Finance 14 (5), 359–379.
[22] Trück, S., Rachev, S., 2005. Credit Portfolio Risk and PD Confidence Sets through the Business Cycle. Journal of Credit Risk 1 (4).
[23] Wei, J., 2003. A Multi-Factor, Credit Migration Model for Sovereign and Corporate Debts. Journal of International Money and Finance 22, 709–735.
[24] Wilson, T., 1997. Measuring and Managing Credit Portfolio Risk. McKinsey & Company.
[25] Wilson, T., 1997. Portfolio Credit Risk I/II. Risk 10.
Stable ETL Optimal Portfolios and Extreme Risk Management

Svetlozar T. Rachev (1), R. Douglas Martin (2), Borjana Racheva (3), and Stoyan Stoyanov (4)

(1) FinAnalytica Inc., Sofia, Bulgaria, [email protected]
(2) FinAnalytica Inc., Seattle WA, USA, [email protected]
(3) FinAnalytica Inc., Sofia, Bulgaria, [email protected]
(4) FinAnalytica Inc., Sofia, Bulgaria, [email protected]
1 Introduction

We introduce a practical alternative to Gaussian risk factor distributions based on Svetlozar Rachev's work on Stable Paretian Models in Finance (see [4]), called the Stable Distribution Framework. In contrast to normal distributions, stable distributions capture the fat tails and the asymmetries of real-world risk factor distributions. In addition, we make use of copulas, a generalization of overly restrictive linear correlation models, to account for the dependencies between risk factors during extreme events, and multivariate ARCH-type processes with stable innovations to account for joint volatility clustering. We demonstrate that the application of these techniques results in more accurate modeling of extreme risk event probabilities, and consequently delivers more accurate risk measures for both trading and risk management. Using these superior models, VaR becomes a much more accurate measure of downside risk. More importantly, Stable Expected Tail Loss (SETL) can be accurately calculated and used as a more informative risk measure for both market and credit portfolios. Along with being a superior risk measure, SETL enables an elegant approach to portfolio optimization via convex optimization that can be solved using standard scalable linear programming software. We show that SETL portfolio optimization yields superior risk-adjusted returns relative to Markowitz portfolios. Finally, we introduce an alternative investment performance measurement tool: the Stable Tail Adjusted Return Ratio (STARR), which is a generalization of the Sharpe ratio in the Stable Distribution Framework.

"When anyone asks me how I can describe my experience of nearly 40 years at sea, I merely say uneventful. Of course there have been winter gales and storms and fog and the like, but in all my experience, I have never been in an accident of any sort worth speaking about. I have seen but one vessel in
distress in all my years at sea (...) I never saw a wreck and have never been wrecked, nor was I ever in any predicament that threatened to end in disaster of any sort.” E.J. Smith, Captain, 1907, RMS Titanic
2 Extreme Asset Returns Demand New Solutions

Professor Paul Wilmott (www.wilmott.com) likes to recount the ritual by which he questions his undergraduate students on the likelihood of Black Monday 1987. Under the commonly accepted Gaussian risk factor distribution assumption, they consistently reply that there should be no such event in the entire existence of the universe and beyond!

The last two decades have witnessed a considerable increase in the fat-tailed kurtosis and skewness of asset returns at all levels: individual assets, portfolios and market indices. Extreme events are the corollary of the increased kurtosis. Legacy risk and portfolio management systems have done a reasonable job at managing ordinary financial events. However, up to now very few institutions or vendors have demonstrated the systematic ability to deal with the unusual or extreme event, the one that should almost never happen using conventional modeling approaches. Therefore, one can reasonably question the soundness of some of the current risk management practices and tools used on Wall Street as far as extreme risk is concerned.

The two main conventional approaches to modeling asset returns are based either on a historical or a normal (Gaussian) distribution for returns. Neither approach adequately captures unusual asset price and return behaviors. The historical model is bounded by the extent of the available observations and the normal model inherently cannot produce atypical returns. The financial industry is beleaguered both with under-optimized portfolios with often-poor ex-post risk-adjusted returns, and with overly optimistic aggregate risk indicators (e.g. VaR) that lead to substantial unexpected losses. The inadequacy of the normal distribution is well recognized by the risk management community. Yet up to now, no consistent and comprehensive alternative has adequately addressed unusual returns. To quote one major vendor:

"It has often been argued that the true distribution of returns (even after standardizing by the volatility) implies a larger probability of extreme returns than that implied from the normal distribution. Although we could try to specify a distribution that fits returns better, it would be a daunting task, especially if we consider that the new distribution would have to provide a good fit across all asset classes." (Technical Manual, RMG, 2001, http://www.riskmetrics.com/publications/index.html).
In response to the challenge, we use generalized multivariate stable (GMstable) distributions and generalized risk-factor dependencies, thereby creating a paradigm shift to consistent and uniform use of the most viable class of non-normal probability models in finance. This approach leads to distinctly improved financial risk management and portfolio optimization solutions for assets with extreme events.
3 The Stable Distribution Framework

3.1 Stable Distributions

In spite of wide-spread awareness that most risk factor distributions are heavy-tailed, to date, risk management systems have essentially relied either on historical, or on univariate and multivariate normal (or Gaussian) distributions for Monte Carlo scenario generation. Unfortunately, historical scenarios only capture conditions actually observed in the past, and in effect use empirical probabilities that are zero outside the range of the observed data, a clearly undesirable feature. On the other hand, Gaussian Monte Carlo scenarios have probability densities that converge to zero too quickly (exponentially fast) to accurately model real-world risk factor distributions that generate extreme losses. When such large returns occur separately from the bulk of the data they are often called outliers.

Figure 1 below shows quantile-quantile (qq)-plots of daily returns versus the best-fit normal distribution of nine randomly selected micro-cap stocks for the two-year period 2000–2001. If the returns were normally distributed, the quantile points in the qq-plots would all fall close to a straight line. Instead they all deviate significantly from a straight line (particularly in the tails), reflecting a higher probability of occurrence of extreme values than predicted by the normal distribution, and showing several outliers. Such behavior occurs in many asset and risk factor classes, including well-known indices such as the S&P 500, and corporate bond prices. The latter are well known to have quite non-Gaussian distributions with substantial negative skews to reflect down-grading and default events. For such returns, non-normal distribution models are required to accurately model the tail behavior and compute probabilities of extreme returns.

Various non-normal distributions have been proposed for modeling extreme events, including:

• Mixtures of two or more normal distributions
• t-distributions, hyperbolic distributions, and other scale mixtures of normal distributions
• Gamma distributions
• Extreme value distributions
• Stable non-Gaussian distributions (also known as Lévy-stable and Pareto-stable distributions)
Fig. 1. Quantile–quantile (qq)-plots versus the best-fit normal distribution
Among the above, only stable distributions have attractive enough mathematical properties to be a viable alternative to normal distributions in trading, optimization and risk management systems. A major drawback of all alternative models is their lack of stability. Benoit Mandelbrot [3] demonstrated that the stability property is highly desirable for asset returns. These advantages are particularly evident in the context of portfolio analysis and risk management. An attractive feature of stable models, not shared by other distribution models, is that they allow generalization of Gaussian-based financial theories and thus allow construction of a coherent and general framework for financial modeling. These generalizations are possible only because of specific probabilistic properties that are unique to (Gaussian and non-Gaussian) stable laws, namely: the stability property, the central limit theorem, and the invariance principle for stable processes. Benoit Mandelbrot [3], then Eugene Fama [2], provided seminal evidence that stable distributions are good models for capturing the heavy-tailed (leptokurtic) returns of securities. Many follow-on studies came to the same conclusion, and the overall stable distribution theory for finance is provided in the definitive work of Rachev and Mittnik [4]; see also [5, 6, 9].
But in spite of the convincing evidence, stable distributions have seen virtually no use in capital markets. There have been several barriers to the application of stable models, both conceptual and technical:

• Except for three special cases, described below, stable distributions have no closed-form expressions for their probability densities.
• Except for normal distributions, which are a limiting case of stable distributions (with α = 2 and β = 0), stable distributions have infinite variance and only a mean value for α > 1.
• Without a general expression for stable probability densities, one cannot directly implement maximum likelihood methods for fitting these densities, even in the case of a single (univariate) set of returns.
The absence of practical techniques for fitting univariate and multivariate stable distributions to asset and risk factor returns has been the main barrier to the progress of stable distributions in finance. Only the recent development of advanced numerical methods has removed this obstacle. These patent-protected methods are at the foundation of the CognityTM risk management and portfolio optimization software system (see further comments in Sect. 5.6).

Univariate Stable Distributions

A stable distribution for a random risk factor X is defined by its characteristic function

F(t) = E(e^(itX)) = ∫ e^(itx) fµ,σ(x) dx,

where fµ,σ(x) = (1/σ) f((x − µ)/σ) is any probability density function in a location-scale family for X:

log F(t) = −σ^α |t|^α [1 − iβ sgn(t) tan(πα/2)] + iµt,     α ≠ 1
log F(t) = −σ |t| [1 − iβ (2/π) sgn(t) log|t|] + iµt,      α = 1

A stable distribution is therefore determined by four key parameters:

1. α determines the density's kurtosis (e.g. tail weight), with 0 < α ≤ 2
2. β determines the density's skewness, with −1 ≤ β ≤ 1
3. σ is a scale parameter (in the Gaussian case, α = 2 and 2σ² is the variance)
4. µ is a location parameter (µ is the mean if 1 < α ≤ 2)

Stable distributions for risk factors allow for skewed distributions when β ≠ 0 and fat tails relative to the Gaussian distribution when α < 2. The graph in Fig. 2 shows the effect of α on tail thickness of the density as well as peakedness at the origin relative to the normal distribution (collectively the "kurtosis" of the density), for the case of β = 0, µ = 0, and σ = 1. As the
Fig. 2. Symmetric stable densities
values of α decrease, the distribution exhibits fatter tails and more peakedness at the origin. The case of α = 2 and β = 0, with the reparameterization in scale σ' = √2 σ, yields the Gaussian distribution, whose density is given by

fµ,σ'(x) = 1/(√(2π) σ') exp(−(x − µ)²/(2σ'²)).

The case α = 1 and β = 0 yields the Cauchy distribution, with much fatter tails than the Gaussian, given by

fµ,σ(x) = 1/(π·σ) [1 + ((x − µ)/σ)²]^(−1).

Figure 3 below illustrates the influence of β on the skewness of the density for α = 1.5, µ = 0 and σ = 1. Increasing (decreasing) values of β result in skewness to the right (left).

Fig. 3. Skewed stable densities

Fitting Stable and Normal Distributions: DJIA Example

Fig. 4. DJIA daily returns from January 1, 1990 to February 14, 2003

Aside from the Gaussian, the Cauchy, and one other special case of a stable distribution for a positive random variable with α = 0.5, there is no closed-form expression for the probability density of a stable random variable. Thus one is not able to directly estimate the parameters of a stable distribution by the method of maximum likelihood. To estimate the four parameters
of the stable laws, the CognityTM system uses a special patent-pending version of the FFT (Fast Fourier Transform) approach to numerically calculate the densities with high accuracy, and then applies maximum likelihood estimation (MLE) to estimate the parameters. The results from applying the CognityTM stable distribution modeling to the DJIA daily returns from January 1, 1990 to February 14, 2003 are displayed in Fig. 4. The figure shows the left-hand tail detail of the resulting stable density, along with that of a normal density fitted using the sample mean and sample standard deviation, and that of a non-parametric kernel density estimate (labeled "Empirical" in the plot legend). The parameter estimates are:
• Stable parameters: α̂ = 1.699, β̂ = −0.120, µ̂ = 0.0002, and σ̂ = 0.006
• Normal density parameter estimates: µ̂ = 0.0003 and σ̂ = 0.010
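The tail probabilities in the table below can be approximated directly from these parameter estimates. As an illustrative stand-in for Cognity's proprietary FFT-based MLE, the following sketch assumes that SciPy's levy_stable (default S1 parameterization) is available; small differences from the reported table values can arise from parameterization and rounding.

```python
import numpy as np
from scipy.stats import levy_stable, norm

# Parameter estimates as reported in the text (S1 parameterization assumed):
alpha, beta_, mu, sigma = 1.699, -0.120, 0.0002, 0.006   # stable fit
mu_n, sigma_n = 0.0003, 0.010                             # normal fit

for x in (-0.04, -0.05, -0.06, -0.07):
    p_stable = levy_stable.cdf(x, alpha, beta_, loc=mu, scale=sigma)
    p_normal = norm.cdf(x, loc=mu_n, scale=sigma_n)
    print(f"P(return < {x}):  stable {p_stable:.4g}   normal {p_normal:.3g}")

# Fitting from raw returns (can be slow; SciPy uses numerical MLE):
# alpha, beta_, mu, sigma = levy_stable.fit(dj_returns)
```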
Note that the stable density tail behavior is reasonably consistent with the empirical non-parametric density estimate, indicating the existence of some extreme returns. At the same time it is clear from the figure that the tail of the normal density is much too thin, and will provide inaccurate estimates of tail probabilities for the DJIA returns. The table below shows just how bad the normal tail probabilities are for several negative return values.

Probability (DJIA Return < x)

x            −0.04      −0.05       −0.06       −0.07
Stable fit   0.0066     0.0043      0.0031      0.0023
Normal fit   0.000056   0.0000007   3.68E−09    7.86E−12
A daily return smaller than −0.04 occurs with probability 0.0066 under the stable distribution, or roughly seven times every four years, whereas under the normal fit such a return is expected only about once in 70 years. Similarly, a return smaller than −0.05 occurs about once per year under the stable fit, but under the normal fit it is essentially a never-event (about once in several thousand years). Clearly the normal distribution fit is an exceedingly optimistic predictor of DJIA tail return values. Figure 5 below displays the central portion of the fitted densities as well as the tails, and shows that the normal fit is not nearly peaked enough near
50
40
30
20
10
0 −0.08
−0.06
−0.04
−0.02
0
0.02
0.04
0.06
0.08
Fig. 5. The fitted stable and normal densities together with the empirical density
the origin as compared with the empirical density estimate (even though the GARCH model was applied), while the stable distribution matches the empirical estimate quite well in the center as well as in the tails.

Fitting Stable Distributions: Micro-Caps Example

Noting that micro-cap stock returns are consistently strongly non-normal (see the sample of normal qq-plots at the beginning of this section), we fit stable distributions to a random sample of 182 micro-cap daily returns for the two-year period 2000–2001. The results are displayed in the box plot in Fig. 6. The median of the estimated alphas is 1.57, and the upper and lower quartiles are 1.69 and 1.46 respectively. Somewhat surprisingly, the distribution of the estimated alphas turns out to be quite normal.

Fig. 6. A box-plot of estimated alphas

Generalized Multivariate Stable Distribution Modeling

Generalized stable distribution (GMstable) modeling is based on fitting univariate stable distributions for each one-dimensional set of returns or risk factors, each with its own parameter estimates αi, βi, µi, σi, i = 1, 2, . . . , K, where K is the number of risk factors, along with a dependency structure. One way to produce the cross-sectional dependency structure is through a scale mixing process (called a "subordinated" process in the mathematical finance literature) as follows. First compute a robust mean vector and covariance matrix estimate of the risk factors to get rid of the outliers, and obtain a good covariance matrix estimate for the central bulk of the data. Next, generate multivariate normal scenarios with this mean vector and covariance matrix. Then multiply each random variable component of the scenarios by a strictly positive stable random variable with index αi/2, i = 1, 2, . . . , K. The vector of stable random variable scale multipliers is usually independent of the normal scenario vectors, but it can also be dependent. See for example Rachev and Mittnik [4], and [5, 6, 9].

Another very promising approach to building the cross-sectional dependence model is through the use of copulas, an approach that is quite attractive because it allows for modeling higher correlations during extreme market movements, thereby accurately reflecting lower portfolio diversification at such times. The next section briefly discusses copulas.

3.2 Copula Multivariate Dependence Models

Why Copulas?

Classical correlations and covariances are quite limited measures of dependence, and are only adequate in the case of multivariate Gaussian distributions. A key failure of correlations is that, for non-Gaussian distributions, zero correlation does not imply independence, a phenomenon that arises in the context of time-varying volatilities represented by ARCH and GARCH models. The reason we use copulas is that we need more general models of dependence, ones which:
• Are not tied to the elliptical character of the multivariate normal distribution.
• Have multivariate contours and corresponding data behavior that reflect the local variation in dependence that is related to the level of returns, in particular, those shapes that correspond to higher correlations with extreme co-movements in returns than with small to modest co-movements.
What are Copulas?

A copula may be defined as a multivariate cumulative distribution function with uniform marginal distributions:

C(u1, u2, · · · , un), ui ∈ [0, 1] for i = 1, 2, · · · , n,

where each univariate margin satisfies C(1, . . . , 1, ui, 1, . . . , 1) = ui for i = 1, 2, · · · , n. It is known that for any multivariate cumulative distribution function

F(x1, x2, · · · , xn) = P(X1 ≤ x1, X2 ≤ x2, · · · , Xn ≤ xn)

there exists a copula C such that

F(x1, x2, · · · , xn) = C(F1(x1), F2(x2), · · · , Fn(xn))
where the Fi(xi) are the marginal distributions of F(x1, x2, · · · , xn), and conversely for any copula C the right-hand side of the above equation defines a multivariate distribution function F(x1, x2, · · · , xn). See for example Bradley and Taqqu [1] and Sklar [8].

The main idea behind the use of copulas is that one can first specify the marginal distributions in whatever way makes sense, e.g. fitting marginal distribution models to risk factor data, and then specify a copula C to capture the multivariate dependency structure in the best suited manner. There are many classes of copula, particularly for the special case of bivariate distributions. For more than two risk factors, beside the traditional Gaussian copula, the t-copula is very tractable for implementation and provides a possibility to model dependencies of extreme events. It is defined as:

Cν,c(u1, u2, · · · , un) = Γ((ν + n)/2) / (Γ(ν/2) √(|c| (νπ)^n)) ∫_(−∞)^(tν^(−1)(u1)) · · · ∫_(−∞)^(tν^(−1)(un)) (1 + s'c^(−1)s/ν)^(−(ν+n)/2) ds
where c is a correlation matrix.

Fig. 7. Bivariate simulations obtained by using a t-copula

A sample of 2,000 bivariate simulated risk factors generated by a t-copula with 1.5 degrees of freedom and normal marginal distributions is displayed in Fig. 7. The example illustrates that these two risk factors are somewhat uncorrelated for small to moderately large returns, but are highly correlated for the
infrequent occurrence of very large returns. This can be seen by noting that the density contours of points in the scatter plot are somewhat elliptical near the origin, but are nowhere close to elliptical for more extreme events. This situation is in contrast to a Gaussian linear dependency relationship, where the density contours are expected to be elliptical.

3.3 Volatility Clustering Models and Stable VaR

It is well known that asset returns and risk factor returns exhibit volatility clustering, and that even after adjusting for such clustering the returns will still be non-normal and contain extreme values. There may also be some serial dependency effects to account for. In order to adequately model these collective behaviors we recommend using ARIMA models with an ARCH/GARCH "time-varying" volatility input, where the latter has non-normal stable innovations. This approach is more flexible and accurate than the commonly used simple exponentially weighted moving average (EWMA) volatility model, and provides accurate time-varying estimates of VaR and expected tail loss (ETL) risk measures. See Sect. 4 for a discussion of ETL vs. VaR that emphasizes the advantages of ETL. However, we stress that those who must use VaR to satisfy regulatory requirements will get much more accurate results with stable VaR than with normal VaR, as the following example vividly shows.

Consider the following portfolio of Brady bonds:

• Brazil C 04/14
• Brazil EIB 04/06
• Venezuela DCB Floater 12/07
• Samsung KRW Ord Shares
• Thai Farmers Bank THB

We have run normal, historical and stable 99% (1% tail probability) VaR calculations for one year of daily data from January 9, 2001 to January 9, 2002. We used a moving window with 250 historical observations for the normal VaR model, 500 for the historical VaR model and 700 for the stable VaR model. For each of these cases we used a GARCH(1,1) model for volatility clustering of the risk factors, with stable innovations. We back-tested these VaR calculations by using the VaR values as one-step-ahead predictors, and got the results shown in Fig. 8. The figure shows: the returns of the Brady bond portfolio (top curve); the normal+EWMA (a la RiskMetrics) VaR (curve with jumpy behavior, just below the returns); the historical VaR (the smoother curve mostly below, but sometimes crossing, the normal+EWMA VaR); and the stable+GARCH VaR (the bottom curve). The results with regard to exceedances of the 99% VaR, and keeping in mind Basel II guidelines, may be summarized as follows:
• Normal 99% VaR produced 12 exceedances (red zone)
• Historical 99% VaR produced 9 exceedances (on the upper edge of the yellow zone)
• Stable 99% VaR produced 1 exceedance, and nearly a second (well within the green zone)

Fig. 8. A VaR back-test example
Clearly, stable (+GARCH) 99% VaR produces much better results with regard to Basel II compliance. This comes at the price of higher initial capital reserves, but results in a much safer level of capital reserves and a very clean bill of health with regard to compliance. Note that organizations in the red zone will have to increase their capital reserves by 33%, which at times, for some portfolios, will result in larger capital reserves than when using the stable VaR, in addition to their being viewed as having inadequate risk measures relative to an organization using stable VaR.
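A back-test of this kind reduces to counting exceedances of the one-step-ahead VaR forecasts and mapping the count to the Basel traffic-light zones (for 250 observations at the 99% level: green for 0-4, yellow for 5-9, red for 10 or more exceedances). The sketch below is a minimal illustration on simulated data; the returns and the constant normal VaR are hypothetical stand-ins, not the Brady bond portfolio of Fig. 8.

```python
import numpy as np

def basel_zone(returns, var_forecasts, obs_per_year=250):
    """Count 99% VaR exceedances and map them to the Basel traffic-light zones.

    returns       : realised portfolio returns
    var_forecasts : one-step-ahead 99% VaR forecasts, reported as positive losses
    """
    returns = np.asarray(returns)
    var_forecasts = np.asarray(var_forecasts)
    exceedances = int(np.sum(returns < -var_forecasts))
    # Rough pro-rating to 250 observations, the horizon used by the regulatory test.
    scaled = exceedances * obs_per_year / len(returns)
    zone = "green" if scaled <= 4 else "yellow" if scaled <= 9 else "red"
    return exceedances, zone

# Hypothetical illustration: Gaussian VaR applied to heavy-tailed (Student t) returns.
rng = np.random.default_rng(42)
rets = 0.01 * rng.standard_t(df=3, size=250) / np.sqrt(3)   # scaled t(3) daily returns
normal_var = 2.326 * rets.std() * np.ones_like(rets)        # constant normal 99% VaR
print(basel_zone(rets, normal_var))
```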
4 ETL is the Next Generation Risk Measure

4.1 Why Not Value-at-Risk (VaR)?

There is no doubt that VaR's popularity is in large part due to its simplicity and its ease of calculation for 1–5% confidence levels. However, there is a price to be paid for the simplicity of VaR in the form of several limitations:
• VaR does not give any indication of the risk beyond the quantile, and so provides very weak information on downside risk.
• VaR portfolio optimization is a non-convex, non-smooth problem with multiple local minima that can result in portfolio composition discontinuities. Furthermore, it requires complex calculation techniques such as integer programming.
• VaR is not sub-additive; i.e. the VaR of the aggregated portfolio can be larger than the sum of the VaRs of the sub-portfolios.
• Historical VaR limits the range of the scenarios to data values that have actually been observed, while normal Monte Carlo tends to seriously underestimate the probability of extreme returns. In either case, the probability functions beyond the sample range are either zero or excessively close to zero.
4.2 ETL and Stable versus Normal Distributions

Expected Tail Loss (ETL) is simply the average (or expected value) loss conditioned on the loss being larger than VaR. ETL is also known as Conditional Value-at-Risk (CVaR) or Expected Shortfall (ES). (We assume that the underlying return distributions are absolutely continuous, and therefore ETL is equal to CVaR.) As such, ETL is intuitively much more informative than VaR. We note however that ETL offers little benefit to investors who use a normal distribution to calculate VaR at the usual 99% confidence limit (1% tail probability). The reason is that the resulting VaR and ETL values differ by very little, specifically:

• For a 1% tail probability, the standard normal VaR is 2.326 and the corresponding ETL is 2.665.
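These two numbers follow directly from the standard normal density: the 1% VaR is the 99% quantile z, and the ETL is φ(z)/0.01. A minimal check:

```python
from scipy.stats import norm

eps = 0.01                                 # tail probability
z = norm.ppf(1 - eps)                      # standard normal 99% VaR
etl = norm.pdf(z) / eps                    # E[X | X > z] for a standard normal
print(f"normal VaR = {z:.3f}, normal ETL = {etl:.3f}")   # ~2.326 and ~2.665
```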
ETL really comes into its own when coupled with stable distribution models that capture leptokurtic tails ("fat tails"). In this case ETL and VaR values will be quite different, with the resulting ETL often being much larger than the VaR. As shown in the graph in Fig. 9, consider the time series of daily returns for the stock OXM from January 2000 to December 2001. Observe the occurrences of extreme values. While this series also displays obvious volatility clustering that deserves to be modeled as described in Sect. 3.3, we shall ignore this aspect for the moment. Rather, here we provide a compelling example of the difference between ETL and VaR based on a well-fitting stable distribution, as compared with a poorly fitting normal distribution. Figure 10 shows a histogram of the OXM returns with a normal density fitted using the sample mean and sample standard deviation, and a stable density fitted using maximum-likelihood estimates of the stable distribution parameters. The stable density is shown by the solid line and the normal density is shown by the dashed line. The former is obviously a better fit than the latter, when using the histogram of the data values as a reference. The
Fig. 9. The daily returns of OXM
Fig. 10. The stable and normal 99% VaR for OXM
estimated stable tail thickness index is α̂ = 1.62. The 1% VaR values for the normal and stable fitted densities are 0.047 and 0.059 respectively, a ratio of 1.26 which reflects the heavier-tailed nature of the stable fit. Figure 11 displays the same histogram and fitted densities with 1% ETL values instead of the 1% VaR values. The 1% ETL values for the normal and stable fitted densities are 0.054 and 0.174 respectively, a ratio of a little over three-to-one. This larger ratio is due to the stable density's heavy tail contribution to ETL relative to the normal density fit.
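For a fitted stable density there is no closed form for VaR or ETL, but both are easy to estimate by simulation once the parameters are known. The sketch below is illustrative only: it reuses the tail index 1.62 mentioned above, while β, µ and σ are hypothetical stand-ins rather than the actual OXM estimates, and SciPy's levy_stable is assumed to be available.

```python
import numpy as np
from scipy.stats import levy_stable

alpha, beta_, mu, sigma = 1.62, 0.0, 0.0, 0.02   # tail index as in the text; rest hypothetical
eps = 0.01

sims = levy_stable.rvs(alpha, beta_, loc=mu, scale=sigma, size=200_000, random_state=0)
var_ = -np.quantile(sims, eps)                    # 1% VaR as a positive loss
etl_ = -sims[sims <= -var_].mean()                # average loss beyond VaR
print(f"stable 1% VaR = {var_:.3f}, 1% ETL = {etl_:.3f}")
```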
Fig. 11. The stable and normal 99% ETL for OXM
Unlike VaR, ETL has a number of attractive properties:

• ETL gives an informed view of losses beyond VaR.
• ETL is a convex, smooth function of portfolio weights, and is therefore attractive to optimize portfolios (see [7]). This point is vividly illustrated in the subsection below on ETL and portfolio optimization.
• ETL is sub-additive and satisfies a set of intuitively appealing coherent risk measure properties.
• ETL is a form of expected loss (i.e. a conditional expected loss) and is a very convenient form for use in scenario-based portfolio optimization. It is also quite a natural risk-adjustment to expected return (see STARR, or Stable Tail Adjusted Return Ratio).
The limitations of current normal risk factor models and the absence of regulator blessing have held back the widespread use of ETL, in spite of its highly attractive properties. However, we expect ETL to become a widely accepted risk measure as portfolio and risk managers become more familiar with those properties. For portfolio optimization, we recommend using stable distribution ETL (SETL) and limiting the use of historical, normal, or stable VaR to required regulatory reporting purposes only. Finally, organizations should consider the advantages of stable ETL for risk assessment and non-regulatory reporting purposes.
4.3 Portfolio Optimization and ETL Versus VaR

To the surprise of many, portfolio optimization with ETL turns out to be a smooth, convex problem with a unique solution [7]. These properties are in sharp contrast to the non-convex, rough VaR optimization problem. The contrast between the VaR and ETL portfolio optimization surfaces is illustrated in Fig. 12 for a simple two-asset portfolio. The horizontal axes show one of the portfolio weights (from 0% to 100%) and the vertical axes display portfolio VaR and ETL, respectively. The data consist of 200 simulated uncorrelated returns. The VaR objective function is quite rough with respect to varying the portfolio weight(s), while the ETL objective function is smooth and convex. One can see that optimizing with ETL is a much more tractable problem than optimizing with VaR.

Rockafellar and Uryasev [7] show that the ETL optimal portfolio weight vector can be obtained from historical (or scenario) returns data by minimizing a relatively simple convex function (Rockafellar and Uryasev used the term CVaR, whereas we use the less confusing synonym ETL). Assuming p assets with single-period returns r_i = (r_{i1}, r_{i2}, ..., r_{ip})' for period i, and a portfolio weight vector w = (w_1, w_2, ..., w_p)', the function to be minimized is

F(w, γ) = γ + \frac{1}{ε n} \sum_{i=1}^{n} [−w' r_i − γ]^{+},
Fig. 12. VaR and ETL surfaces as functions of portfolio weights
where [x]^{+} denotes the positive part of x. This function is minimized jointly with respect to w and γ, where ε is the tail probability for which the expected tail loss is computed. Typically ε = 0.05 or 0.01, but larger values may be useful, as we discuss in Sect. 5.5. The authors further show that this optimization problem can be cast as an LP (linear programming) problem, solvable using any high-quality LP software (a small numerical sketch of this formulation is given at the end of Sect. 4.4). Cognity™ combines this approach with fitting GMstable distribution models for scenario generation. The stable scenarios provide accurate and well-behaved estimates of ETL for the optimization problem.

4.4 Stable ETL Leads to Higher Risk Adjusted Returns

ETL portfolio optimization based on GMstable distribution modeling, which we refer to as SETL portfolios, can lead to significant improvements in risk-adjusted return as compared to the conventional Markowitz mean–variance portfolio optimization. Figures 13 and 14 are supplied to illustrate the claim that stable ETL optimal portfolios produce consistently better risk-adjusted returns. These figures show the risk-adjusted return MU/VaR (mean return divided by VaR) and MU/ETL (mean return divided by ETL) for 1% VaR optimal portfolios and ETL optimal portfolios, using a multi-period fixed-mix optimization in all cases. In this simple example, the portfolio to be optimized consists of two assets, cash and the S&P 500. The example is based on monthly data from February 1965 to December 1999. Since we assume full investment, the VaR optimal portfolio depends only on a single portfolio weight, and the optimal weight is found by a simple grid search on the interval 0 to 1. The use of a grid search technique overcomes the problems with non-convex and non-smooth VaR optimization. In this example the optimizer maximizes MU − c · VaR and MU − c · ETL, where c is the risk aversion parameter and VaR or ETL serves as the penalty function. Figure 13 shows that, even using the VaR optimal portfolio, one gets a significant relative gain in risk-adjusted return using stable scenarios when compared to normal scenarios, with the relative gain increasing with increasing risk aversion. The reason for the latter behavior is that with stable distributions the optimization pays more attention to the tails of the S&P returns distribution, and allocates less investment to the S&P under stable distributions than under normal distributions as risk aversion increases. Figure 14, for the risk-adjusted return of the ETL optimal portfolio, has the same vertical axis range as the previous plot for the VaR optimal portfolio. It shows that the use of ETL results in a much greater gain under the stable distribution relative to the normal than in the case of the VaR optimal portfolio. At every level of risk aversion, the investment in the S&P 500 is even less in the ETL optimal portfolio than in the case of the VaR optimal portfolio.
Fig. 13. Risk aversion versus risk-adjusted return, VaR based

Fig. 14. Risk aversion versus risk-adjusted return, ETL based
This behavior is to be expected because the ETL approach pays attention to the losses beyond VaR (the expected value of the extreme losses), which in the stable case are much greater than in the normal case.
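As a small numerical illustration of the Rockafellar–Uryasev formulation from Sect. 4.3, the sketch below recasts the minimization of F(w, γ) over scenario returns as a linear program with auxiliary variables u_i ≥ [−w'r_i − γ]^+. The long-only and full-investment constraints, the optional mean-return floor, and the simulated scenarios are illustrative assumptions of this sketch, not the Cognity implementation described in the text:

```python
import numpy as np
from scipy.optimize import linprog

def min_etl_weights(R, eps=0.05, mu_target=None):
    """Minimize F(w, gamma) = gamma + 1/(eps*n) * sum_i [-w'r_i - gamma]^+ over
    scenario returns R (n scenarios x p assets), cast as a linear program with
    auxiliary variables u_i >= [-w'r_i - gamma]^+ (long-only, fully invested)."""
    n, p = R.shape
    # variable order: w_1..w_p, gamma, u_1..u_n
    c = np.concatenate([np.zeros(p), [1.0], np.full(n, 1.0 / (eps * n))])

    # u_i >= -w'r_i - gamma   <=>   -R w - gamma - u <= 0
    A_ub = np.hstack([-R, -np.ones((n, 1)), -np.eye(n)])
    b_ub = np.zeros(n)

    if mu_target is not None:
        # require a scenario-average portfolio return of at least mu_target
        row = np.concatenate([-R.mean(axis=0), [0.0], np.zeros(n)])
        A_ub = np.vstack([A_ub, row])
        b_ub = np.append(b_ub, -mu_target)

    # full investment: weights sum to one
    A_eq = np.concatenate([np.ones(p), [0.0], np.zeros(n)]).reshape(1, -1)
    b_eq = np.array([1.0])

    bounds = [(0.0, 1.0)] * p + [(None, None)] + [(0.0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:p], res.fun          # optimal weights and optimal ETL value

# toy usage with made-up heavy-tailed scenarios (a stand-in for stable scenarios)
rng = np.random.default_rng(0)
R = rng.standard_t(df=4, size=(1000, 3)) * 0.01
w_opt, etl_opt = min_etl_weights(R, eps=0.01)
print(w_opt, etl_opt)
```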
5 The Stable ETL Paradigm

5.1 The Stable ETL Framework

Our risk management and portfolio optimization framework uses multidimensional asset and risk factor returns models based on GMstable distributions, and stresses the use of Stable ETL (SETL) as the risk measure of choice. These stable distribution models incorporate generalized dependence structure with copulas, and include time-varying volatilities based on GARCH models with stable innovations. Henceforth we use the term GMstable distribution to include the generalized dependence structure and volatility clustering aspects of the model. Collectively, these modeling foundations form the basis of a new and powerful overall basis for investment decisions that we call the SETL Framework. Currently the SETL framework has the following basic components:
• SETL scenario engines
• SETL factor models
• SETL integrated market risk and credit risk
• SETL optimal portfolios and efficient frontiers
• SETL derivative pricing
Going forward, additional classes of SETL investment decision models will be developed, such as SETL betas and SETL asset-liability models. The rich structure of these models will encompass the heavy-tailed distributions of the asset returns, stochastic trends, heteroscedasticity, short- and long-range dependence, and more. We use the term “SETL model” to describe any such model in order to keep in mind the importance of the stable tail-thickness parameter α and skewness parameter β, along with volatility clustering and general dependence models, in financial investment decisions. It is essential to keep in mind the following SETL fundamental principles concerning risk factors:
(P1) Asset and risk factor returns have stable distributions, where each asset or risk factor typically has a different stable tail index α_i and skewness parameter β_i.
(P2) Asset and risk factor returns are associated through models that describe the dependence between the individual factors more accurately than classical correlations. Often these will be copula models.
(P3) Asset and risk factor modeling typically includes a SETL econometric model in the form of multivariate ARIMA-GARCH processes
with residuals driven by fractional stable innovations. The SETL econometric model captures clustering and long-range dependence of the volatility.

5.2 Stable ETL Optimal Portfolios

A SETL optimal portfolio is one that minimizes portfolio expected tail loss subject to a constraint of achieving expected portfolio returns at least as large as an investor-defined level, along with other typical constraints on weights, where both quantities are evaluated in the SETL framework. Alternatively, a SETL optimal portfolio solves the dual problem of maximizing portfolio expected return subject to a constraint that portfolio expected tail loss is not greater than an investor-defined level, where again both quantities are evaluated in the SETL framework.

In order to define the above ETL precisely we use the following quantities:
R_p: the random return of portfolio p
SER_p: the stable distribution expected return of portfolio p
L_p = −R_p + SER_p: the loss of portfolio p relative to its expected return
ε: a tail probability of the SETL distribution of L_p
SVaR_p(ε): the stable distribution Value-at-Risk for portfolio p

The latter is defined by the equation

Pr[L_p > SVaR_p(ε)] = ε,

where the probability is calculated in the SETL framework, that is, SVaR_p(ε) is the ε-quantile of the stable distribution of L_p. In the value-at-risk literature (1 − ε) × 100% is called the confidence level. Here we prefer to use the simpler, unambiguous term tail probability. Now we define the SETL of a portfolio p as

SETL_p(ε) = E[L_p | L_p > SVaR_p(ε)],

where the conditional expectation is also computed in the SETL framework. We use the “S” in SER_p, SVaR_p(ε) and SETL_p(ε) as a reminder that stable distributions are a key aspect of the framework (but not the only aspect!). Proponents of normal distribution VaR typically use tail probabilities of 0.01 or 0.05. When using SETL_p(ε), risk managers may wish to use other tail probabilities such as 0.1, 0.15, 0.20, 0.25, or 0.5. We note that use of different tail probabilities is similar in spirit to using different utility functions.

The following assumptions are in force for the SETL investor:
(A1) The universe of assets is Q (the set of mandate admissible portfolios).
(A2) The investor may borrow or deposit at the risk-free rate r_f without restriction.
(A3) The portfolio is optimized under a set of asset allocation constraints λ.
(A4) The investor seeks an expected return of at least µ.
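In a scenario-based implementation, SVaR_p(ε) and SETL_p(ε) as defined above can be estimated directly from simulated portfolio losses. A minimal sketch, assuming the losses L_p have already been generated (for example by a stable scenario engine) and using the empirical (1 − ε)-quantile:

```python
import numpy as np

def svar_setl(losses, eps=0.05):
    """Empirical estimates of SVaR_p(eps) (the eps tail-probability quantile of the
    loss distribution) and SETL_p(eps) = E[L_p | L_p > SVaR_p(eps)] from simulated
    scenario losses L_p = -R_p + SER_p."""
    losses = np.asarray(losses)
    svar = np.quantile(losses, 1.0 - eps)      # P(L_p > SVaR) = eps
    tail = losses[losses > svar]
    setl = tail.mean() if tail.size else svar
    return svar, setl

# toy usage with hypothetical heavy-tailed scenario losses
rng = np.random.default_rng(1)
losses = rng.standard_t(df=3, size=100_000) * 0.01
print(svar_setl(losses, eps=0.01))
```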
To simplify the notation we shall let (A3) be implicit in the following discussion. At times we shall also suppress the ε when its value is taken as fixed and understood. The SETL investor's optimal portfolio is

ω_α(µ | ε) = arg min_{q ∈ Q} SETL_q(ε)   subject to   SER_q ≥ µ.

Here we use ω_α to mean either the resulting portfolio weights or the label for the portfolio itself, depending upon the context. The subscript α reminds us that we are using a GMstable distribution modeling approach (which entails different stable distribution parameters for each asset and risk factor). In other words, the SETL optimal portfolio ω_α minimizes the expected tail loss among all portfolios with mean return at least µ, for fixed tail probability ε and asset allocation constraints λ. Alternatively, the SETL optimal portfolio ω_α solves the dual problem

ω_α(η | ε) = arg max_{q ∈ Q} SER_q   subject to   SETL_q(ε) ≤ η.

The SETL efficient frontier is given by ω_α(µ | ε) as a function of µ for fixed ε, as indicated in Fig. 15. If the portfolio includes a cash account with risk-free rate r_f, then the SETL efficient frontier will be the SETL capital market line (CML_α) that connects the risk-free rate on the vertical axis with the SETL tangency portfolio (T_α), as indicated in the figure. We now have a SETL separation principle analogous to the classical separation principle: the tangency portfolio T_α can be computed without reference to the risk-return preferences of any investor, and an investor then chooses a portfolio along the SETL capital market line CML_α according to his/her risk-return preference.
Fig. 15. The SETL efficient frontier and the capital market line
Keep in mind that, in practice, when working with a finite sample of returns one ends up with a SETL efficient frontier, tangency portfolio, and capital market line that are estimates of the true values of these quantities.

5.3 Markowitz Portfolios are Sub-Optimal

While the SETL investor has the optimal portfolios described above, the Markowitz investor is not aware of the SETL framework and constructs a mean–variance optimal portfolio. We assume that the Markowitz investor operates under the same assumptions (A1)–(A4) as the SETL investor. Let ER_q be the expected return and σ_q the standard deviation of the returns of a portfolio q. The Markowitz investor's optimal portfolio is

ω_2(µ) = arg min_{q ∈ Q} σ_q   subject to   ER_q ≥ µ,

along with the other constraints λ. The Markowitz optimal portfolio can also be constructed by solving the obvious dual optimization problem. The subscript 2 is used in ω_2 as a reminder that with α = 2 one has the limiting Gaussian member of the stable distribution family, in which case the Markowitz portfolio is optimal. Alternatively, one can think of the subscript 2 as a reminder that the Markowitz optimal portfolio is a second-order optimal portfolio, i.e., an optimal portfolio based on only first and second moments.

The Markowitz investor ends up with a different portfolio, i.e., a different set of portfolio weights with different risk versus return characteristics, than the SETL investor. It is important to note that the performance of the Markowitz portfolio, like that of the SETL portfolio, is evaluated under a GMstable distributional model. If in fact the returns were exactly multivariate normal (which they never are), then the SETL investor and the Markowitz investor would end up with one and the same optimal portfolio. However, when the returns are non-Gaussian stable returns, the Markowitz portfolio is sub-optimal. This is because the SETL investor constructs his/her optimal portfolio using the correct distribution model, while the Markowitz investor does not. Thus the Markowitz investor's frontier lies below and to the right of the SETL efficient frontier, as shown in Fig. 16, along with the Markowitz tangency portfolio T_2 and Markowitz capital market line CML_2.

As an example of the performance improvement achievable with the SETL optimal portfolio approach, we computed the SETL efficient frontier and the Markowitz frontier for a portfolio of 47 micro-cap stocks with the smallest alphas from the random selection of 182 micro-caps in Sect. 3.1. The results are displayed in Fig. 17.
Fig. 16. The SETL and the Markowitz efficient frontiers
Fig. 17. SETL and Markowitz efficient portfolios, a micro-cap example (return versus tail risk, daily returns of 47 micro-caps, 2000–2001, tail probability = 1%)
The results are based on 3,000 scenarios from the fitted GMstable distribution model based on two years of daily data during 2000 and 2001. We note that, as is generally the case, each of the 47 stock returns has its own estimated stable tail index α̂_i, i = 1, 2, ..., 47. Here we have plotted values of TailRisk = ε · SETL(ε), for ε = 0.01, as a natural decision-theoretic risk measure, rather than SETL(ε) itself. We note that over a considerable range of tail risk the SETL efficient frontier dominates the Markowitz frontier by 14–20 bps daily!
We note that the 47 micro-caps with the smallest alphas used for this example have quite heavy tails, as indicated by the box plot of their estimated alphas. Here the median of the estimated alphas is 1.38, while the upper and lower quartiles are 1.43 and 1.28 respectively. Evidently there is a fair amount of information in the non-Gaussian tails of such micro-caps that can be exploited by the SETL approach.

5.4 From Sharpe to STARR-Performance and R-Performance Measures

The Sharpe Ratio for a given portfolio p is defined as follows:

SR_p = \frac{ER_p − r_f}{σ_p},   (1)

where ER_p is the portfolio expected return, σ_p is the portfolio return standard deviation as a measure of portfolio risk, and r_f is the risk-free rate. While the Sharpe ratio is the single most widely used portfolio performance measure, it has several disadvantages due to its use of the standard deviation as risk measure:
• σ_p is a symmetric measure that does not focus on downside risk
• σ_p is not a coherent measure of risk (see Artzner et al. 1999)
• σ_p has an infinite value for non-Gaussian stable distributions
Stable Tail Adjusted Return Ratio

As an alternative performance measure that does not suffer these disadvantages, we propose the Stable Tail Adjusted Return Ratio (STARR) defined as

STARR_p(ε) = \frac{SER_p − r_f}{SETL_p(ε)}.   (2)

Referring to the first figure in Sect. 5.3, one sees that a SETL optimal portfolio produces the maximum STARR under a SETL distribution model, and that this maximum STARR is just the slope of the SETL capital market line CML_α. On the other hand, the maximum STARR of a Markowitz portfolio is equal to the slope of the Markowitz capital market line CML_2. The latter is always dominated by CML_α, and is equal to CML_α only in the case where the returns distribution is multivariate normal, in which case α = 2 for all asset and risk factor returns. Referring to the second figure of Sect. 5.3, one sees that for a relatively high risk-free rate of 5 bps per day, the STARR for the SETL portfolio dominates that of the Markowitz portfolio. Furthermore, this dominance appears quite likely to persist if the efficient frontiers were calculated for lower risk and return positions and smaller risk-free rates were used.
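Given scenario estimates of SER_p and SETL_p(ε), the STARR in (2) is immediate to compute. A small sketch under the same empirical-quantile approximation used in the Sect. 5.2 sketch; the scenario returns passed in are hypothetical:

```python
import numpy as np

def starr(portfolio_returns, rf=0.0, eps=0.05):
    """STARR_p(eps) = (SER_p - r_f) / SETL_p(eps), estimated from scenario returns,
    with losses measured relative to the expected return as in Sect. 5.2."""
    r = np.asarray(portfolio_returns)
    ser = r.mean()                              # scenario estimate of SER_p
    losses = -r + ser                           # L_p = -R_p + SER_p
    svar = np.quantile(losses, 1.0 - eps)       # SVaR_p(eps)
    setl = losses[losses > svar].mean()         # SETL_p(eps)
    return (ser - rf) / setl
```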
We conclude that the risk-adjusted return of the SETL optimal portfolio ω_α is generally superior to the risk-adjusted return of the Markowitz mean–variance optimal portfolio ω_2. The SETL framework results in improved investment performance.

Rachev Ratio

The Rachev Ratio (R-ratio) is the ratio between the expected excess tail return at a given confidence level and the expected excess tail loss at another confidence level:

ρ(r) = \frac{ETL_{γ_1}(x'(r_f − r))}{ETL_{γ_2}(x'(r − r_f))}.

Here the levels γ_1 and γ_2 are in [0, 1], x is the vector of asset allocations and r − r_f is the vector of asset excess returns. Recall that if r is the portfolio return, and L = −r is the portfolio loss, we define the expected tail loss as ETL_{α%}(r) = E(L | L > VaR_{α%}), where P(L > VaR_{α%}) = α, and α is in (0, 1). The R-ratio is a generalization of the STARR. By choosing appropriate levels γ_1 and γ_2 in optimizing the R-ratio, the investor can seek the best risk/return profile of her portfolio. For example, an investor with a portfolio allocation maximizing the R-ratio with γ_1 = γ_2 = 0.01 is seeking exceptionally high returns and protection against high losses.

5.5 The Choice of Tail Probability

We mentioned earlier that when using SETL_p(ε) rather than VaR_p(ε), risk managers and portfolio optimizers may wish to use other values of ε than the conventional VaR values of 0.01 or 0.05; for example, values such as 0.1, 0.15, 0.2, 0.25 and 0.5 may be of interest. The choice of a particular ε amounts to a choice of a particular risk measure in the SETL family of measures, and such a choice is equivalent to the choice of a utility function. The tail probability parameter ε is at the asset manager's disposal to choose according to his/her asset management and risk control objectives. Note that choosing a tail probability ε is not the same as choosing a risk aversion parameter. Maximizing SER_p − c · SETL_p(ε) for various choices of risk aversion parameter c for a fixed value of ε merely corresponds to choosing different points along the SETL efficient frontier. On the other hand, changing ε results in different shapes and locations of the SETL efficient frontier, and correspondingly different SETL excess profits relative to a Markowitz portfolio. It is intuitively clear that increasing ε will decrease the degree to which a SETL optimal portfolio depends on extreme tail losses. In the limit of ε = 0.5, which may well be of interest to some managers since it uses the average
loss below zero of L_p as its penalty function, small to moderate losses are mixed in with extreme losses in determining the optimal portfolio. There is some concern that some of the excess profit advantage relative to Markowitz portfolios will be given up as ε increases. Our studies to date indicate, not surprisingly, that this effect is most noticeable for portfolios with smaller stable tail index values. It will be interesting to see going forward what values of ε will be used by fund managers of various types and styles.

A generalization of the SETL efficient frontier is the R-efficient frontier, obtained by replacing the stable portfolio expected return SER_p in SER_p − c · SETL_p(ε) by the excess tail return, the numerator in the R-ratio. The R-efficient frontier allows for fine-tuning of the tradeoff between high excess mean returns and protection against large losses.

5.6 The Cognity Implementation of the SETL Framework

The SETL framework described in this paper has been implemented in the Cognity™ Risk Management and Portfolio Optimization product. This product contains solution modules for Market Risk, Credit Risk (with integrated Market and Credit Risk), Portfolio Optimization, and Fund-of-Funds portfolio management, with integrated factor models. Cognity™ is implemented in a modern Java-based server architecture to support both desktop and Web delivery. For further details see www.finanalytica.com.

Acknowledgements

The authors gratefully acknowledge the extensive help provided by Stephen Elston and Frederic Siboulet in the preparation of this paper. The authors owe a special debt to Paul Wilmott for extensive suggestions on an earlier version of our work, great understanding and encouragement.
References

[1] Bradley, B. O. and Taqqu, M. S. (2003). "Financial Risk and Heavy Tails", in Handbook of Heavy Tailed Distributions in Finance, edited by S. T. Rachev, Elsevier/North-Holland, Amsterdam
[2] Fama, E. (1963). "Mandelbrot and the Stable Paretian Hypothesis", Journal of Business, 36, 420–429
[3] Mandelbrot, B. B. (1963). "The Variation of Certain Speculative Prices", Journal of Business, 36, 394–419
[4] Rachev, S. and Mittnik, S. (2000). Stable Paretian Models in Finance. Wiley, New York
[5] Racheva-Iotova, B., Stoyanov, S., and Rachev, S. (2003). "Stable Non-Gaussian Credit Risk Model; The Cognity Approach", in Credit Risk (Measurement, Evaluations and Management), edited by G. Bol, G. Nakhaheizadeh, S. Rachev, T. Rieder, K.-H. Vollmer, Physica-Verlag Series: Contributions to Economics, Springer, Heidelberg, NY, 179–198
[6] Rachev, S., Menn, C., and Fabozzi, F. J. (2005). Fat Tailed and Skewed Asset Return Distributions: Implications for Risk, Wiley-Finance, Hoboken
[7] Rockafellar, R. T. and Uryasev, S. (2000). "Optimization of Conditional Value-at-Risk", Journal of Risk, 3, 21–41
[8] Sklar, A. (1996). "Random Variables, Distribution Functions, and Copulas – a Personal Look Backward and Forward", in Distributions with Fixed Marginals and Related Topics, edited by Rüschendorf et al., Institute of Mathematical Statistics, Hayward, CA
[9] Stoyanov, S. and Racheva-Iotova, B. (2004). "Univariate Stable Laws in the Field of Finance – Parameter Estimation", Journal of Concrete and Applied Mathematics, 2, 369–396
Pricing Tranches of a CDO and a CDS Index: Recent Advances and Future Research

Dezhong Wang¹, Svetlozar T. Rachev², and Frank J. Fabozzi³

¹ Department of Applied Probability and Statistics, University of California, Santa Barbara CA, USA, [email protected]
² Department of Econometrics, Statistics and Mathematical Finance, University of Karlsruhe, Germany, and Department of Applied Probability and Statistics, University of California, Santa Barbara CA, USA, [email protected]
³ Yale School of Management, New Haven CT, USA, [email protected]
1 Introduction

In recent years, the market for credit derivatives has developed rapidly with the introduction of new contracts and the standardization of trade documentation. These include credit default swaps, basket default swaps, credit default swap indexes, collateralized debt obligations, and credit default swap index tranches. Along with the introduction of new products comes the issue of how to price them. For single-name credit default swaps, there are several factor models (one-factor and two-factor models) proposed in the literature. However, for credit portfolios, much work has to be done in formulating models that fit market data. The difficulty in modeling lies in estimating the correlation risk for a portfolio of credits. In an April 16, 2004 article in the Financial Times [5], Darrell Duffie made the following comment on modeling portfolio credit risk: “Banks, insurance companies and other financial institutions managing portfolios of credit risk need an integrated model, one that reflects correlations in default and changes in market spreads. Yet no such model exists.” Almost a year later, a March 2005 publication by the Bank for International Settlements noted that while a few models have been proposed, the modeling of these correlations is “complex and not yet fully developed” [1].

In this paper, first we review three methodologies for pricing CDO tranches. They are the one-factor copula model, the structural model, and the loss process model. Then we propose how the models can be improved. The paper is structured as follows. In the next section we review credit default swaps and in Sect. 3 we review collateralized debt obligations and credit default swap index tranches. The three pricing models are reviewed in Sects. 4 (one-factor copula model), 5 (structural model), and 6 (loss process model). Our proposed models are provided in Sect. 7 and a summary is provided in the final section, Sect. 8.
2 Overview of Credit Default Swaps

The major risk-transferring instrument developed in the past few years has been the credit default swap. This derivative contract permits market participants to transfer credit risk for individual credits and credit portfolios. Credit default swaps are classified as follows: single-name swaps, basket swaps, and credit default index swaps.

2.1 Single-Name Credit Default Swap

A single-name credit default swap (CDS) involves two parties: a protection seller and a protection buyer. The protection buyer pays the protection seller a swap premium on a specified amount of face value of bonds (the notional principal) for an individual company (the reference entity/reference credit). In return, the protection seller pays the protection buyer an amount to compensate for the loss of the protection buyer upon the occurrence of a credit event with respect to the underlying reference entity.

In the documentation of a CDS contract, a credit event is defined. The list of credit events in a CDS contract may include one or more of the following: bankruptcy or insolvency of the reference entity, failure to pay an amount above a specified threshold over a specified period, and financial or debt restructuring. The swap premium is paid on a series of dates, usually quarterly in arrears based on the actual/360 day count convention. In the absence of a credit event, the protection buyer will make a quarterly swap premium payment until the expiration of the CDS contract. If a credit event occurs, two things happen. First, the protection buyer pays the accrued premium from the last payment date to the time of the credit event to the seller (on a day-count fraction basis). After that payment, there are no further payments of the swap premium by the protection buyer to the protection seller. Second, the protection seller makes a payment to the protection buyer. There can be either cash settlement or physical settlement. In cash settlement, the protection seller pays the protection buyer an amount of cash equal to the difference between the notional principal and the present value of an amount of bonds, whose face value equals the notional principal, after a credit event. In physical settlement, the protection seller pays the protection buyer the notional principal, and the protection buyer delivers to the protection seller bonds whose face value equals the notional principal. At the time of this writing, the market practice is physical settlement.

2.2 Basket Default Swap

A basket default swap is a credit derivative on a portfolio of reference entities. The simplest basket default swaps are first-to-default swaps, second-to-default swaps, and nth-to-default swaps. With respect to a basket of reference entities,
a first-to-default swap provides insurance for only the first default, a second-to-default swap provides insurance for only the second default, and an nth-to-default swap provides insurance for only the nth default. For example, in an nth-to-default swap, the protection seller does not make a payment to the protection buyer for the first n − 1 defaulted reference entities, and makes a payment for the nth defaulted reference entity. Once there is a payment upon the default of the nth defaulted reference entity, the swap terminates. Unlike a single-name CDS, the preferred settlement method for a basket default swap is cash settlement.

2.3 Credit Default Swap Index

A credit default swap index (denoted by CDX) contract provides protection against the credit risk of a standardized basket of reference entities. The mechanics of a CDX are slightly different from those of a single-name CDS. If a credit event occurs, the swap premium payment ceases in the case of a single-name CDS. In contrast, for a CDX the swap premium payment continues to be made by the protection buyer but based on a reduced notional amount since fewer reference entities are being protected. As of this writing, settlement for a CDX is physical settlement.⁴

Currently, there are two families of standardized indexes: the Dow Jones CDX⁵ and the International Index Company iTraxx.⁶ The former includes reference entities in North America and emerging markets, while the latter includes reference entities in markets in Europe and Asia. Both families of indexes are standardized in terms of the index composition procedure, premium payment, and maturity. The two most actively traded indexes are the Dow Jones CDX NA IG index and the iTraxx Europe index. The former includes 125 North American investment-grade companies. The latter includes 125 European investment-grade companies. For both indexes, each company is equally weighted. Also for these two indexes, CDX contracts with 3-, 5-, 7- and 10-year maturities are available. The composition of reference entities included in a CDX is renewed every six months based on a vote of participating dealers. The start date of a new version index is referred to as the roll date. The roll date is March 20 and September 20 of a calendar year, or the following business days if these days are not business days. A new version index will be “on-the-run” for the next six months.

⁴ The market is considering moving to cash settlement because of the cost of delivering an odd lot in the case of a credit event for a reference entity. For example, if the notional amount of a contract is $20 million and a credit event occurs, the protection buyer would have to deliver to the protection seller bonds of the reference entity with a face value of $160,000. Neither the protection buyer nor the protection seller likes to deal with such a small position.
⁵ www.djindexes.com/mdsidx/?index=cdx
⁶ www.indexco.com
The composition of each version of a CDX remains static in its lifetime if no default occurs to the underlying reference entities, and the defaulted reference entities are eliminated from the index. There are two kinds of contracts on CDXs: unfunded and funded. An unfunded contract is a CDS on a portfolio of names. This kind of contract is traded on all the Dow Jones CDX and the iTraxx indexes. For some CDXs, such as the Dow Jones CDX NA HY index and its sub-indexes⁷ and the iTraxx Europe index, the funded contract is traded. A funded contract is a credit-linked note (CLN), allowing investors who, because of client-imposed or regulatory restrictions, are not permitted to invest in derivatives to gain risk exposure to the CDX market. The funded contract works like a corporate bond with some slight differences. A corporate bond ceases when a default occurs to the reference entity. If a default occurs to a reference entity in an index, the reference entity is removed from the index (and also from the funded contract). The funded contract continues with a reduced notional principal for the surviving reference entities in the index. Unlike the unfunded contract, which uses physical settlement, the settlement method for the funded contract is cash settlement.

The index swap premium of a new version index is determined before the roll day and is unchanged over its lifetime; it is referred to as the coupon or the deal spread. The price difference between the prevailing market spread and the deal spread is paid upfront. If the prevailing market spread is higher than the deal spread, the protection buyer pays the price difference to the protection seller. If the prevailing market spread is less than the deal spread, the protection seller pays the price difference to the protection buyer. The index premium payments are standardized quarterly in arrears on the 20th of March, June, September, and December of each calendar year.

The CDXs have many attractive properties for investors. Compared with single-name swaps, the CDXs have the advantages of diversification and efficiency. Compared with basket default swaps and collateralized debt obligations, the CDXs have the advantages of standardization and transparency. The CDXs are traded more actively than single-name CDSs, with low bid–ask spreads.

⁷ The Dow Jones CDX NA HY index includes 100 equal-weighted North America High Yield reference entities. Its sub-indexes include the CDX NA HY B (B-rated), CDX NA HY BB (BB-rated), and CDX NA HY HB (High Beta) indexes.
3 CDOs and CDS Index Tranches

Based on the technology of basket default swaps, the layer protection technology was developed for protecting portfolio credit risk. Basket default swaps provide protection against a single default in a portfolio of reference entities,
for example, the first default, the second default, and the nth default. Correspondingly, there are the first layer protection, the second layer protection, and the nth layer protection. These protection layers work like basket default swaps with some differences. The main difference is that the nth-to-default basket swap protects against the nth default in a portfolio, while the nth protection layer protects the nth layer of the principal of a portfolio, which is specified by a percentage range, for example 15–20%. The layer protection derivative products include collateralized debt obligations and CDS index tranches.

3.1 Collateralized Debt Obligation

A collateralized debt obligation (CDO) is a security backed by a diversified pool of one or more kinds of debt obligations such as bonds, loans, credit default swaps or structured products (mortgage-backed securities, asset-backed securities, and even other CDOs). A CDO is initiated by a sponsor, which can be a bank, a nonbank financial institution, or an asset management company. The sponsor of a CDO creates an entity called a special purpose vehicle (SPV). The SPV works as an independent entity. In this way, CDO investors are isolated from the credit risk of the sponsor. Moreover, the SPV is responsible for the administration. The SPV obtains the credit risk exposure by purchasing debt obligations (bonds or residential and commercial loans) or selling CDSs; it transfers the credit risk by issuing debt obligations (tranches/credit-linked notes). The investors in the tranches of a CDO have the ultimate credit risk exposure to the underlying reference entities.

Figure 1 shows the basic structure of a CDO backed by a portfolio of bonds. The SPV issues four kinds of CLNs referred to as tranches. Each tranche has an attachment percentage and a detachment percentage. When the cumulative percentage loss of the portfolio of bonds reaches the attachment percentage, investors in the tranche start to lose their principal, and when the cumulative
Fig. 1. Structure of a collateralized debt obligation (collateral pool of bonds → SPV → tranche 1: 0–5%, tranche 2: 5–15%, tranche 3: 15–30%, tranche 4: 30–70%)
percentage loss of principal reaches the detachment percentage, the investors in the tranche lose all their principal and no further loss can occur to them. For example, in Fig. 1 the second tranche has an attachment percentage of 5% and a detachment percentage of 15%. The tranche will be used to cover the cumulative loss during the life of the CDO in excess of 5% (its attachment percentage) and up to a maximum of 15% (its detachment percentage).

In the literature, tranches of a CDO are classified as the subordinate/equity tranche, mezzanine tranches, and senior tranches according to their subordination levels (see [12]). For example, in Fig. 1 tranche 1 is an equity tranche, tranches 2 and 3 are mezzanine tranches, and tranche 4 is a senior tranche. Because the equity tranche is extremely risky, the sponsor of a CDO is one of the holders of the equity tranche and the SPV sells the other tranches to investors.

If the SPV of a CDO actually owns the underlying debt obligations, the CDO is referred to as a cash CDO. Cash CDOs can be classified as collateralized bond obligations (CBOs) and collateralized loan obligations (CLOs). The former have only bonds in their pool of debt obligations, and the latter have only commercial loans in their pool of debt obligations. If the SPV of a CDO does not own the debt obligations, instead obtaining the credit risk exposure by selling CDSs on the debt obligations of reference entities, the CDO is referred to as a synthetic CDO. Based on the motivation of sponsors, CDOs can be classified as balance sheet CDOs and arbitrage CDOs. The motivation for balance sheet CDOs (primarily CLOs) is to transfer the risk of loans in a sponsoring bank's portfolio in order to reduce regulatory capital requirements. The motivation for arbitrage CDOs is to arbitrage the interest difference between the underlying pool of debt obligations and CDO tranches.

3.2 CDS Index Tranches

With the innovation of CDXs, the synthetic CDO technology is applied to slice CDXs, and standardized tranches with different subordination levels are created to satisfy investors with different risk appetites. The tranches of an index provide the layer protections to the underlying portfolio of the index in the same way as the tranches of a CDO provide the layer protections to the underlying portfolio of the CDO, as explained earlier. Both of the most actively traded indexes – the Dow Jones CDX NA IG and the iTraxx Europe – are sliced into five tranches: equity tranche, junior mezzanine tranche, senior mezzanine tranche, junior senior tranche, and super senior tranche. The standard tranche structure of the Dow Jones CDX NA IG is 0–3%, 3–7%, 7–10%, 10–15%, and 15–30%. The standard tranche structure of the iTraxx Europe is 0–3%, 3–6%, 6–9%, 9–12%, and 12–22%.

Table 1 shows the index and tranche market quotes for the CDX NA IG and the iTraxx Europe on August 4, 2004. For both indexes, the swap premium of the equity tranche is paid differently from the non-equity tranches. It includes two parts: (1) the upfront percentage payment and (2) the fixed 500
Table 1. CDS index and tranche market quotes – August 4, 2004

iTraxx Europe (5 year)
Index: 42 | 0–3%: 27.6% | 3–6%: 168 | 6–9%: 70 | 9–12%: 43 | 12–22%: 20

CDX NA IG (5 year)
Index: 63.25 | 0–3%: 48.1% | 3–7%: 347 | 7–10%: 135.5 | 10–15%: 47.5 | 15–30%: 14.5

Data are collected by GFI Group Inc. and used in [6]
basis points premium per annum. The market quote is the upfront percentage payment. For example, the market quote of 27.8% for the iTraxx equity tranche means that the protection buyer pays the protection seller 27.8% of the notional principal upfront. In addition to the upfront payment, the protection buyer also pays the protection seller the fixed 500 basis points premium per annum on the outstanding notional principal. For all the non-equity tranches, the market quotes are the premium in basis points, paid quarterly in arrears. Just like the indexes, the premium payments for the tranches (with the exception of the upfront percentage payment of the equity tranche) are made on the 20th of March, June, September, and December of each calendar year. Following the commonly accepted definition for a synthetic CDO, CDX tranches are not part of a synthetic CDO because they are not backed by a portfolio of bonds or CDSs [6]. In addition, CDX tranches are unfunded and they are insurance contracts, while synthetic CDO tranches are funded and they are CLNs. However, the net cash flows of index tranches are the same as synthetic CDO tranches and these tranches can be priced the same way as a synthetic CDO.
4 One-Factor Copula Model

The critical input for pricing synthetic CDO and CDS index tranches is an estimate of the default dependence (default correlation) between the underlying assets. One popular method for estimating the dependence structure is using copula functions, a method first applied in actuarial science. While there are several types of copula function models, Li [10, 11] introduces the one-factor Gaussian copula model for the case of two companies and Laurent and Gregory [9] extend the model to the case of N companies. Several extensions to the one-factor Gaussian copula model were subsequently introduced into the literature. In this section, we provide a general description of the one-factor copula function, introduce the market standard model, and review both the
one-factor double t copula model [6] and the one-factor normal inverse Gaussian copula model [8].

Suppose that a CDO includes n assets i = 1, 2, ..., n and that the default time τ_i of the ith asset follows a Poisson process with a parameter λ_i. The λ_i is the default intensity of the ith asset. Then the probability of a default occurring before time t is

P(τ_i < t) = 1 − \exp(−λ_i t).   (1)

In a one-factor copula model, it is assumed that the default time τ_i for the ith company is related to a random variable X_i with a zero mean and a unit variance. For any given time t, there is a corresponding value x such that

P(X_i < x) = P(τ_i < t),  i = 1, 2, ..., n.   (2)

Moreover, the one-factor copula model assumes that each random variable X_i is the sum of two components,

X_i = a_i M + \sqrt{1 − a_i^2}\, Z_i,  i = 1, 2, ..., n,   (3)

where Z_i is the idiosyncratic component of company i, and M is the common component of the market. It is assumed that M and the Z_i's are mutually independent random variables. For simplicity, it is also assumed that the random variables M and Z_i's are identically distributed. The factor a_i satisfies −1 ≤ a_i ≤ 1. The default correlation between X_i and X_j is a_i a_j (i ≠ j).

Let F denote the cumulative distribution of the Z_i's and G denote the cumulative distribution of the X_i's. Then, given the market condition M = m, we have

P(X_i < x | M = m) = F\!\left(\frac{x − a_i m}{\sqrt{1 − a_i^2}}\right),   (4)

and the conditional default probability is

P(τ_i < t | M = m) = F\!\left(\frac{G^{−1}[P(τ_i < t)] − a_i m}{\sqrt{1 − a_i^2}}\right).   (5)
For simplicity, the following two assumptions are made:
• All the companies have the same default intensity, i.e., λ_i = λ.
• The pairwise default correlations are the same, i.e., in (3), a_i = a.

The second assumption means that the contribution of the market component is the same for all the companies and the correlation between any two companies is constant, β = a². Under these assumptions, given the market situation M = m, all the companies have the same cumulative risk-neutral default probability D_{t|m}. Moreover, for a given value of the market component M, the defaults are mutually independent for all the underlying companies. Letting N_{t|m} be the
total number of defaults that have occurred by time t conditional on the market condition M = m, then N_{t|m} follows a binomial distribution Bin(n, D_{t|m}), and

P(N_{t|m} = j) = \frac{n!}{j!(n−j)!}\, D_{t|m}^{j} (1 − D_{t|m})^{n−j},  j = 0, 1, 2, ..., n.   (6)

The probability that there will be exactly j defaults by time t is

P(N_t = j) = E_M[P(N_{t|m} = j)] = \int_{−∞}^{∞} P(N_{t|m} = j)\, f_M(m)\, dm,   (7)

where f_M(m) is the probability density function (pdf) of the random variable M.

4.1 Market Standard Model

Li [10, 11] was the first to suggest that the Gaussian copula can be employed in credit risk modeling to estimate the default correlation. In a one-factor Gaussian copula model, the distributions of the common market component M and the individual components Z_i in (3) are standard normal distributions. Because the sum of two independent Gaussian random variables is still Gaussian, the X_i's in (3) have a closed form; it can be verified that the X_i's have a standard normal distribution. The one-factor Gaussian copula model is the market standard model when implemented under the following assumptions:
• A fixed recovery rate of 40%
• The same CDS spreads for all of the underlying reference entities
• The same pairwise correlations
• The same default intensities for all the underlying reference entities
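Under these assumptions the loss distribution (7) reduces to a one-dimensional integral over the market factor. A minimal sketch of the market standard (Gaussian) case; the inputs λ, a, and t below are illustrative and not taken from the paper:

```python
import numpy as np
from scipy.stats import norm, binom
from scipy.integrate import quad

def default_count_distribution(n, lam, a, t):
    """P(N_t = j), j = 0..n, for the one-factor Gaussian copula with equal
    default intensities lam and equal factor loadings a (correlation a**2).
    Implements (5)-(7): conditional on M = m the number of defaults is
    Bin(n, D_{t|m}); integrate over the standard normal density of M."""
    p_t = 1.0 - np.exp(-lam * t)          # unconditional default probability (1)
    threshold = norm.ppf(p_t)             # G^{-1}[P(tau_i < t)] for Gaussian X_i
    root = np.sqrt(1.0 - a * a)

    def integrand(m, j):
        d_tm = norm.cdf((threshold - a * m) / root)   # conditional default prob (5)
        return binom.pmf(j, n, d_tm) * norm.pdf(m)    # (6) weighted by f_M(m)

    return np.array([quad(integrand, -10, 10, args=(j,))[0] for j in range(n + 1)])

probs = default_count_distribution(n=125, lam=0.01, a=np.sqrt(0.2), t=5.0)
print(probs[:5], probs.sum())   # the probabilities should sum to ~1
```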
The market standard model does not appear to fit market data well (see [6, 8]). In practice, market practitioners use implied correlations and base correlations. The implied correlation for a CDO tranche is the correlation that makes the value of a contract on the CDO tranche zero when pricing the CDO with the market standard model. For a CDO tranche, when inputting its implied correlation into the market standard model, the simulated price of the tranche should be its market price. McGinty, Beinstein, Ahluwalia, and Watts [14] introduced base correlations in CDO pricing. To understand base correlations, let us use an example. Recalling the CDX NA IG tranches 0–3%, 3–7%, 7–10%, 10–15%, and 15–30%, and assuming there exists a sequence of equity tranches 0–3%, 0–7%, 0–10%, 0–15%, and 0–30%, the premium payment on an equity tranche is a combination of the premium payments of the CDX NA IG tranches that are included in the corresponding equity tranche. For example, the equity tranche 0–10% includes three CDX NA IG tranches: 0–3%, 3–7%, and 7–10%. The premium
payment on the equity tranche 0–10% includes three parts. The part of 0–3% is paid the same way as the CDX NA IG tranche 0–3%, the part of 3–7% is paid the same way as the CDX NA IG tranche 3–7%, and the part of 7–10% is paid the same way as the CDX NA IG tranche 7–10%. The base correlation is then defined as the correlation input that makes the prices of the contracts on this series of equity tranches zero. For example, the base correlation for the CDX NA IG tranche 7–10% is the implied correlation that makes the price of a contract on the equity tranche 0–10% zero.

4.2 One-Factor Double t Copula Model

The natural extension to a one-factor Gaussian copula model uses heavy-tailed distributions. Hull and White [6] propose a one-factor double t copula model. In the model, the common market component M and the individual components Z_i in (3) are assumed to have normalized Student's t distributions,

M = \sqrt{(n_M − 2)/n_M}\, T_{n_M},  T_{n_M} ∼ T(n_M),
Z_i = \sqrt{(n_i − 2)/n_i}\, T_{n_i},  T_{n_i} ∼ T(n_i),   (8)

where T_n is a Student's t distribution with degrees of freedom n = 3, 4, 5, .... In the model, the distributions of the X_i's do not have a closed form but instead must be calculated numerically. Hull and White [6] find that the one-factor double t copula model fits market prices well when using the Student's t distribution with 4 degrees of freedom for M and the Z_i's.

4.3 One-Factor Normal Inverse Gaussian Copula Model

Kalemanova, Schmid, and Werner [8] propose utilizing normal inverse Gaussian distributions in a one-factor copula model. A normal inverse Gaussian distribution is a mixture of normal and inverse Gaussian distributions. An inverse Gaussian distribution has the following density function:

f_{IG}(x; ζ, η) = \frac{ζ}{\sqrt{2πη}}\, x^{−3/2} \exp\!\left(−\frac{(ζ − ηx)^2}{2ηx}\right)  if x > 0,  and  f_{IG}(x; ζ, η) = 0  if x ≤ 0,   (9)

where ζ > 0 and η > 0 are two parameters. We denote the inverse Gaussian distribution as IG(ζ, η). Suppose Y is an inverse Gaussian random variable. A normal random variable X ∼ N(υ, σ²) is a normal inverse Gaussian (NIG) random variable when its mean υ and variance σ² are random variables as given below:

υ = µ + βY,  σ² = Y,  Y ∼ IG(δγ, γ²),   (10)
where δ > 0, 0 ≤ |β| < α, and γ := \sqrt{α² − β²}. The distribution of the random variable X is denoted by X ∼ NIG(α, β, µ, δ). The density of X is

f(x; α, β, µ, δ) = \frac{δα \exp(δγ + β(x − µ))}{π \sqrt{δ² + (x − µ)²}}\, K\!\left(α\sqrt{δ² + (x − µ)²}\right),   (11)

where K(·) is the modified Bessel function of the third kind, defined below:

K(ω) := \frac{1}{2} \int_0^{∞} \exp\!\left(−\frac{1}{2}\, ω\,(t + t^{−1})\right) dt.   (12)

The mean and variance of the NIG random variable X are, respectively,

E(X) = µ + \frac{δβ}{γ},  Var(X) = \frac{δα²}{γ³}.   (13)

The family of NIG distributions has two main properties. One is closure under scale transformations,

X ∼ NIG(α, β, µ, δ) ⇒ cX ∼ NIG(α/c, β/c, cµ, cδ).   (14)

The other is that if two independent NIG random variables X and Y have the same α and β parameters, then the sum of these two variables is still an NIG random variable, as shown below:

X ∼ NIG(α, β, µ₁, δ₁), Y ∼ NIG(α, β, µ₂, δ₂) ⇒ X + Y ∼ NIG(α, β, µ₁ + µ₂, δ₁ + δ₂).   (15)

When NIG distributions are used in a one-factor copula model, the model is referred to as a one-factor normal inverse Gaussian copula model. The distributions for M and the Z_i's in (3) are given below:

M ∼ NIG\!\left(α, β, −\frac{αβ}{\sqrt{α² − β²}}, α\right),
Z_i ∼ NIG\!\left(\frac{\sqrt{1 − a_i²}}{a_i}\,α, \frac{\sqrt{1 − a_i²}}{a_i}\,β, −\frac{\sqrt{1 − a_i²}}{a_i}\,\frac{αβ}{\sqrt{α² − β²}}, \frac{\sqrt{1 − a_i²}}{a_i}\,α\right).   (16)

The distributions of the X_i's in (3) are

X_i ∼ NIG\!\left(\frac{α}{a_i}, \frac{β}{a_i}, −\frac{1}{a_i}\,\frac{αβ}{\sqrt{α² − β²}}, \frac{α}{a_i}\right).   (17)

The selection of the parameters makes the variables X_i, M, and Z_i have a zero mean, and a unit variance when β = 0. The one-factor normal inverse Gaussian copula model fits market data a little better than the one-factor double t copula model. The advantage of the one-factor normal inverse Gaussian copula model is that the X_i's in the model have a closed form. This reduces the computing time significantly compared with that of the one-factor double t copula model; the former is about five times faster than the latter.
5 Structural Model

Hull, Predescu, and White [7] propose the structural model to price the default correlation in tranches of a CDO or an index. The idea is based on Merton's model [17] and its extension by Black and Cox [3]. It is assumed that the value of a company follows a stochastic process, and if the value of the company goes below a minimum value (barrier), the company defaults. In the model, N different companies are assumed and the value of company i (1 ≤ i ≤ N) at time t is denoted by V_i. The value of the company follows a stochastic process as shown below:

dV_i = µ_i V_i\, dt + σ_i V_i\, dX_i,   (18)
where µ_i is the expected growth rate of the value of company i, σ_i is the volatility of the value of company i, and X_i(t) is a variable following a continuous-time Gaussian stochastic process (Wiener process). The barrier for company i is denoted by B_i. Whenever the value of company i goes below the barrier B_i, it defaults. Without loss of generality, it is assumed that X_i(0) = 0. Applying Ito's formula to ln V_i, it is easy to show that

X_i(t) = \frac{\ln V_i(t) − \ln V_i(0) − (µ_i − σ_i²/2)t}{σ_i}.   (19)

Corresponding to B_i, there is a barrier B_i^* for the variable X_i as given below:

B_i^* = \frac{\ln B_i − \ln V_i(0) − (µ_i − σ_i²/2)t}{σ_i}.   (20)

When X_i falls below B_i^*, company i defaults. Denote

β_i = \frac{\ln B_i − \ln V_i(0)}{σ_i},  γ_i = −\frac{µ_i − σ_i²/2}{σ_i};   (21)

then B_i^* = β_i + γ_i t. To model the default correlation, it is assumed that each Wiener process X_i follows a two-component process which includes a common Wiener process M and an idiosyncratic Wiener process Z_i. It is expressed as

dX_i(t) = a_i(t)\, dM(t) + \sqrt{1 − a_i^2(t)}\, dZ_i(t),   (22)

where the variable a_i, −1 ≤ a_i ≤ 1, is used to control the weight of the two components. The Wiener processes M and Z_i's are uncorrelated with each other. In this model, the default correlation between two companies i and j is a_i a_j. The model can be implemented by Monte Carlo simulation. Hull, Predescu, and White [7] implement the model in three different ways:
• Base case. Constant correlation and constant recovery rate.
• Stochastic Corr. Stochastic correlation and constant recovery rate.
• Stochastic RR. Stochastic correlation and stochastic recovery rate.
Two comparisons between the base-case structural model and the one-factor Gaussian copula model are provided. One is to calculate the joint default probabilities of two companies by both models. The other is to simulate the iTraxx Europe index tranche market quotes by both models. In both cases, the results of the two models are very close when the same default time correlations are input. While the one-factor Gaussian copula is a good approximation to the base-case structural model, the structural model has two advantages: it is a dynamic model and it has a clear economic rationale.
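A base-case version of this simulation is straightforward: discretize the correlated processes X_i on a time grid and record the first crossing of the barrier B_i*(t) = β_i + γ_i t. The parameter values in the sketch below are illustrative and not calibrated as in [7]:

```python
import numpy as np

def simulate_default_times(n_names, n_paths, horizon, n_steps, a, beta0, gamma0, rng=None):
    """Base-case structural model: dX_i = a dM + sqrt(1 - a^2) dZ_i with a common
    Brownian factor M; default at the first time X_i(t) < B_i*(t) = beta0 + gamma0*t.
    Returns default times (np.inf where no default occurs before the horizon)."""
    rng = np.random.default_rng() if rng is None else rng
    dt = horizon / n_steps
    times = dt * np.arange(1, n_steps + 1)
    barrier = beta0 + gamma0 * times                       # same barrier for all names here

    tau = np.full((n_paths, n_names), np.inf)
    x = np.zeros((n_paths, n_names))
    for k, t in enumerate(times):
        dm = rng.standard_normal((n_paths, 1)) * np.sqrt(dt)        # common factor increment
        dz = rng.standard_normal((n_paths, n_names)) * np.sqrt(dt)  # idiosyncratic increments
        x += a * dm + np.sqrt(1.0 - a * a) * dz
        hit = (x < barrier[k]) & np.isinf(tau)
        tau[hit] = t
    return tau

tau = simulate_default_times(n_names=125, n_paths=2000, horizon=5.0, n_steps=250,
                             a=0.4, beta0=-2.0, gamma0=-0.1)
print(np.isfinite(tau).sum(axis=1).mean())   # average number of defaults per path
```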
6 Loss Process Model

Loss process models for pricing correlation risk have been developed by Schönbucher [20], Sidenius et al. [21], Di Graziano and Rogers [4], and Bennani [2]. Here we introduce the basic idea of the loss process model as discussed by Schönbucher. We omit the mathematical details.

6.1 Model Setup

The model is set up in the probability space (Ω, (F_t)_{0≤t≤T}, Q), where Q is a spot martingale measure, (F_t)_{0≤t≤T} is the filtration satisfying the common definitions, and Ω is the sample space. Assume that there are N company names in a portfolio. Each name has the same notional principal in the portfolio. Under the assumption of a homogeneous recovery rate for all the companies, all companies have identical losses in default, which are normalized to one. The cumulative default loss process is defined by

L(t) = \sum_{k=1}^{N} 1_{\{τ_k ≤ t\}},   (23)
where τ_k is the default time of company k, and the default indicator 1_{\{τ_k ≤ t\}} is 1 when τ_k ≤ t and 0 when τ_k > t. The loss process is an N-bounded, integer-valued, non-decreasing Markov chain. Under the Q-measure, the probability distribution of L(T) at time t < T is denoted by the vector p(t, T) := (p_0(t, T), ..., p_N(t, T))', where the p_i's are conditional probabilities

p_i(t, T) := P[L(T) = i | F_t],  i = 0, 1, ..., N,  t ≤ T.   (24)

The conditional probability p_i(t, T) is the implied probability of L(T) = i, T ≥ t, given the information up to time t; p(t, ·) is referred to as the loss distribution at time t.
6.2 Static Loss Process

To price a CDO, it is necessary to determine an implied initial loss distribution p(0, T). The implied initial loss distribution can be found by solving for the evolution of the loss process L(t). Since L(t) is an inhomogeneous Markov chain on the finite state space {0, 1, 2, . . . , N}, its transition probabilities are uniquely determined by its generator matrix. Assuming that at most one default can occur at any given time t (jumps of size one), the generator matrix of the loss process is the bidiagonal matrix

A(t) = [ −λ_0(t)   λ_0(t)     0     ...       0            0
            0     −λ_1(t)  λ_1(t)   ...       0            0
            ⋮         ⋮        ⋮      ⋱        ⋮            ⋮
            0         0        0     ...  −λ_{N−1}(t)  λ_{N−1}(t)
            0         0        0     ...       0            0      ],   (25)

where the λ_i(t), i = 0, 1, . . . , N − 1, are the transition rates. The state N is absorbing. The transition probability matrix, defined by P_{i,j}(t, T) := P[L(T) = j | L(t) = i], satisfies the Kolmogorov forward equations

d/dT P_{i,0}(t, T) = −λ_0(T) P_{i,0}(t, T),
d/dT P_{i,j}(t, T) = −λ_j(T) P_{i,j}(t, T) + λ_{j−1}(T) P_{i,j−1}(t, T),   0 < j < N,
d/dT P_{i,N}(t, T) = λ_{N−1}(T) P_{i,N−1}(t, T),   (26)

for all i = 0, 1, . . . , N and 0 ≤ t ≤ T. The initial conditions are P_{i,j}(t, t) = 1_{i=j}. The solution of the Kolmogorov equations (26) is

P_{i,j}(t, T) = 0                                                            for i > j,
P_{i,j}(t, T) = exp{−∫_t^T λ_i(t, s) ds}                                     for i = j,
P_{i,j}(t, T) = ∫_t^T e^{−∫_s^T λ_j(t, u) du} P_{i,j−1}(t, s) λ_{j−1}(t, s) ds   for i < j.   (27)
The implied loss distribution at time t is then simply

p_i(t, T) = P[L(T) = i | F_t] = P_{L(t),i}(t, T).   (28)

For example, if L(t) = k, then the implied loss distribution at time t is

p_i(t, T) = P_{k,i}(t, T).   (29)
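As a concrete illustration of (25)-(28), the following Python sketch assumes time-independent transition rates, in which case the solution of the Kolmogorov equations reduces to a matrix exponential of the generator; the rate values are illustrative.

```python
import numpy as np
from scipy.linalg import expm

def generator(lambdas):
    """Build the bidiagonal generator (25) from transition rates lambda_0..lambda_{N-1}."""
    N = len(lambdas)
    A = np.zeros((N + 1, N + 1))
    for i, lam in enumerate(lambdas):
        A[i, i] = -lam
        A[i, i + 1] = lam
    return A                                   # last row is zero: state N is absorbing

# Illustrative constant rates for N = 4 names; with time-independent rates the
# solution of the Kolmogorov equations (26) is P(t, T) = expm(A * (T - t)).
lambdas = np.array([0.10, 0.12, 0.15, 0.20])
P = expm(generator(lambdas) * 5.0)             # transition probabilities over 5 years

# Implied loss distribution (28): starting from L(t) = 0, p_i(t, T) is row 0 of P.
print(P[0])                                    # sums to 1, gives P[L(T) = i | L(t) = 0]
```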
6.3 Dynamic Loss Process

In the dynamic version of the loss process model, the loss process follows a Poisson process with time- and state-dependent inhomogeneous default intensities λ_{L(t)}(t), L(t) = 0, . . . , N − 1, which are the transition rates in the
generator matrix in (25). The aggregate default intensity λ_{L(t)}(t) can be expressed in terms of the individual intensities λ_k(t) as

λ_{L(t)}(t) = Σ_{k∈S(t)} λ_k(t),   (30)

where S(t) := {1 ≤ k ≤ N | τ_k > t} is the set of companies that have not defaulted by time t. The loss process is assumed to follow a Poisson process with stochastic intensity, that is, a Cox process. The forward transition rates λ_i(t, T) are assumed to follow the dynamics

dλ_i(t, T) = µ_i(t, T) dT + σ_i(t, T) dB(t),   i = 0, . . . , N − 1,   (31)
where B(t) is a d-dimensional Q-Brownian motion, the µ_i(t, T)'s are the drifts, and the σ_i(t, T)'s are the d-dimensional volatilities of the stochastic processes. To keep these dynamics consistent with the loss process L(t), the following drift conditions must be satisfied:

P_{L(t),i}(t, T) µ_i(t, T) = σ_i(t, T) υ_{L(t),i}(t, T),   0 ≤ i ≤ N − 1,   t ≤ T,   (32)

where the υ_{i,j}(t, T)'s are given by

υ_{i,j}(t, T) = 0                                                                         for i > j,
υ_{i,j}(t, T) = −P_{i,j}(t, T) ∫_t^T σ_i(t, s) ds                                         for i = j,
υ_{i,j}(t, T) = ∫_t^T e^{−∫_s^T λ_j(t, u) du} [P^a_{σ,i,j−1}(t, s) − P_{i,j−1}(t, s) σ_j(t, s)] ds   for i < j,   (33)

with

P^a_{σ,i,j−1}(t, T) = P_{i,j−1}(t, T) σ_{j−1}(t, T) + λ_{j−1}(t, T) υ_{i,j−1}(t, T).   (34)
6.4 Default Correlation

In the loss process model, default correlation between companies can arise both from the transition rates of the loss process and from the volatilities of the stochastic processes. To understand the dependence induced by the transition rates, recall what default correlation means: joint defaults and the clustering of defaults. After one or more companies default, the individual default intensities of the surviving companies increase. This dependence of the individual default intensities on the number of defaults (the loss process L(t)) can be captured by a proper selection of the transition rates λ_i(t), i = 0, 1, . . . , N − 1. This is how the transition rates induce default dependence between companies. The dependence induced by the volatilities can be explained by considering the case of a one-dimensional driving Brownian motion. For non-zero transition rate volatilities,

σ_i(t, T) > 0 for all 0 ≤ i ≤ N − 1,   (35)

the Brownian motion acts as an indicator of the common market condition: if its value is positive, the market condition is bad and all the transition rates increase; if its value is negative, the market condition is good and all the transition rates decrease.
6.5 Implementation of the Dynamic Loss Process Model

The model can be implemented by a Monte Carlo method. For pricing a CDO with maturity T, the procedure is as follows (a simplified sketch of steps 2 and 3 is given after this list):

1. Initial condition: set t = 0 and L(0) = 0 (p_0(0, 0) = 1), and specify the λ_i(0, 0)'s and σ_i(0, ·)'s.
2. Simulate a Brownian motion trial.
3. Step s → s + ∆s (until s = T):
   • Calculate P_{0,m}(0, s) from (27) and υ_{0,j}(0, s) from (33), and use them together with σ_i(0, s) to calculate µ_i(0, s) from (32).
   • Calculate λ_i(0, s + ∆s) with an Euler scheme using µ_i(0, s) and σ_i(0, s).
   • Calculate the loss distribution p_i(0, s + ∆s) from (27), using the representation of the loss distribution in (29).
   • The loss distribution p_i(0, ·) on the time interval (0, T) is thus built up step by step.
4. Repeat steps 2 and 3 until the loss distributions p_i(0, ·) averaged over all trials converge.
5. Use the average loss distributions p_i(0, ·) to price the CDO.

The loss process model can also be used to price other portfolio credit derivatives such as basket default swaps, options on CDS indexes, and options on CDS index tranches.
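The following Python fragment is a heavily simplified sketch of steps 2 and 3: it takes simulated transition-rate paths as given and draws one path of the loss process, but it does not enforce the drift restriction (32) or rebuild the loss distribution via (27) and (29). The rate dynamics below are illustrative stand-ins, not the model's consistent dynamics.

```python
import numpy as np

def simulate_loss_path(lam_paths, dt, rng):
    """Draw one path of L(t) given simulated transition-rate paths.
    lam_paths[i, k] is lambda_i at time step k, i = 0..N-1 (a simplified stand-in
    for the rates produced by the Euler scheme for (31); the drift condition (32)
    is not enforced here)."""
    n_rates, n_steps = lam_paths.shape
    N = n_rates                      # number of names; states 0..N, N absorbing
    L = 0
    path = [0]
    for k in range(n_steps):
        if L < N and rng.random() < lam_paths[L, k] * dt:   # prob. of one more default in dt
            L += 1
        path.append(L)
    return np.array(path)

rng = np.random.default_rng(1)
N, T, n_steps = 10, 5.0, 1000
dt = T / n_steps
# Illustrative stochastic rates: exponential noise around increasing base levels.
base = 0.05 * (1.0 + np.arange(N))[:, None]
noise = np.exp(0.2 * np.cumsum(rng.normal(0, np.sqrt(dt), size=(N, n_steps)), axis=1))
lam_paths = base * noise
print(simulate_loss_path(lam_paths, dt, rng)[-1])   # loss at maturity for this trial
```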
7 Models for Pricing Correlation Risk

In this section, we give our suggestions for future research. It consists of two parts. In the first part, we analyze the shortcomings of the one-factor double t copula model and then propose four new heavy-tailed one-factor copula models. In the second part, we give our proposals for improving the structural model and the loss process model.

7.1 Heavy-Tailed Copula Models

Hull and White [6] were the first to use heavy-tailed distributions (Student's t distributions) in a one-factor copula model. In their so-called one-factor double t copula model, Hull and White use the t distribution with ν degrees of freedom for the market component M and the individual components Z_i's in equation (3). The degrees of freedom parameter ν of the t distribution can be 3, 4, 5, . . . . When ν equals 3, the copula function has the maximum tail-fatness; as ν increases, the tail-fatness of the copula function decreases. As mentioned before, Hull and White find that the double t copula model fits market data well when ν is equal to 4. But the simulation by Kalemanova et al. [8] shows a different result. When
Kalemanova et al. compare their model with the double t copula model, and in addition to the simulation results for their own model, they also give the simulation results for the double t copula model with the degrees of freedom parameter ν equal to 3 and to 4. These results show that the double t copula model fits market data better with ν = 3 than with ν = 4. One difference between these two studies is the market data used in the simulations: Hull and White use market data for the 5-year iTraxx Europe tranches on August 4, 2004, while Kalemanova et al. use market data from April 12, 2006. The disagreement over how many degrees of freedom make the double t copula fit market data well may therefore suggest that market data from different times call for double t copula models with different tail-fatnesses.

The drawbacks of the double t copula are that its tail-fatness cannot be changed continuously and that the maximum tail-fatness occurs when the degrees of freedom parameter ν is equal to 3. In order to fit market data well over time, the tail-fatness of a one-factor copula model should be adjustable continuously and should be able to exceed the maximum tail-fatness of the one-factor double t copula model. In the following, we suggest four one-factor heavy-tailed copula models. Each model has (1) a tail-fatness parameter that can be changed continuously and (2) a maximum tail-fatness much larger than that of the one-factor double t copula model.

One-Factor Double Mixture Gaussian Copula Model

The mixture Gaussian distribution is a mixture of two or more Gaussian distributions. For simplicity, we consider the mixture of two Gaussian distributions with zero mean. If the random variable Y has such a mixture Gaussian distribution, then it can be expressed as

Y = X_1 with probability p,
    X_2 with probability 1 − p,   (36)

where X_1 and X_2 are independent Gaussian random variables with zero mean,

E X_1 = E X_2 = 0,   Var X_1 = σ_1² and Var X_2 = σ_2²,   (37)

with σ_1 > σ_2. The mixture Gaussian random variable Y has zero mean, and its variance is

Var Y = p σ_1² + (1 − p) σ_2².   (38)

The pdf of Y is

f_Y(y) = (p / (√(2π) σ_1)) exp(−y²/(2σ_1²)) + ((1 − p) / (√(2π) σ_2)) exp(−y²/(2σ_2²)).   (39)
The mixture Gaussian random variable Y can be normalized by the transformation
Ỹ = Y / √(p σ_1² + (1 − p) σ_2²).   (40)

The pdf of Ỹ is

f_Ỹ(y) = (p √(p σ_1² + (1 − p) σ_2²) / (√(2π) σ_1)) exp(−y² (p σ_1² + (1 − p) σ_2²) / (2σ_1²))
       + ((1 − p) √(p σ_1² + (1 − p) σ_2²) / (√(2π) σ_2)) exp(−y² (p σ_1² + (1 − p) σ_2²) / (2σ_2²)).   (41)
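As a sketch of how the standardized mixture (40)-(41) would be used, the following Python fragment samples the normalized mixture and plugs it into the one-factor structure X_i = a_i M + √(1 − a_i²) Z_i of (3); the values of p, σ_1, σ_2, and a are illustrative.

```python
import numpy as np

def sample_normalized_mixture(size, p=0.1, sigma1=3.0, sigma2=0.5, rng=None):
    """Draw from the two-component mixture (36)-(38) and rescale it as in (40)
    so that the result has zero mean and unit variance.
    The parameter values used below are illustrative."""
    rng = rng if rng is not None else np.random.default_rng()
    pick = rng.random(size) < p
    y = np.where(pick, rng.normal(0.0, sigma1, size), rng.normal(0.0, sigma2, size))
    return y / np.sqrt(p * sigma1**2 + (1 - p) * sigma2**2)

rng = np.random.default_rng(0)
sample = sample_normalized_mixture(500_000, rng=rng)
print(sample.mean(), sample.var())          # approximately 0 and 1

# Plugging the factors into the one-factor structure of (3): X_i = a M + sqrt(1 - a^2) Z_i.
n_names, a = 125, 0.4
M = sample_normalized_mixture(1, rng=rng)   # common factor
Z = sample_normalized_mixture(n_names, rng=rng)
X = a * M + np.sqrt(1.0 - a**2) * Z         # latent variables for one scenario
```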
Using the standardized mixture Gaussian distribution in (41) as the distribution of the M and Z_i's in (3), we obtain our first extension to the one-factor Gaussian copula model, which we refer to as a double mixture Gaussian copula model. In this model, the tail-fatness of the M and Z_i's is determined by the parameters σ_1, σ_2, and p. In the implementation of the model, we can fix the parameters σ_1 and σ_2 and make p the only parameter controlling the tail-fatness of the copula function.

One-Factor Double t Distribution with Fractional Degrees of Freedom Copula Model

The pdf of the gamma(α, β) distribution is

f(x | α, β) = (1 / (Γ(α) β^α)) x^{α−1} exp(−x/β),   0 < x < ∞,   α > 0,   β > 0.   (42)
Setting α = ν/2 and β = 2, we obtain an important special case of the gamma distribution, the chi-square distribution, which has the pdf

f(x | ν) = (1 / (Γ(ν/2) 2^{ν/2})) x^{ν/2−1} exp(−x/2),   0 < x < ∞,   ν > 0.   (43)
If the degrees of freedom parameter ν is an integer, (43) is the chi-square distribution with ν degrees of freedom. However, ν need not be an integer; when ν is extended to a positive real number, we get the chi-square distribution with ν fractional degrees of freedom. If U is a standard normal random variable, V is a chi-square random variable with ν fractional degrees of freedom, and U and V are independent, then T = U / √(V/ν) has the pdf

f_T(t | ν) = (Γ((ν+1)/2) / (Γ(ν/2) √(νπ))) (1 + t²/ν)^{−(ν+1)/2},   −∞ < t < ∞,   ν > 0.   (44)

This is the Student's t distribution with ν fractional degrees of freedom (see [13]). Its mean and variance are, respectively,

E T = 0,   ν > 1;   Var T = ν / (ν − 2),   ν > 2.   (45)
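A minimal Python sketch of sampling T in (44) with a fractional ν uses the representation T = U/√(V/ν), where V is a gamma(ν/2, 2) (chi-square) variate; the rescaling to unit variance anticipates the normalization discussed next, and the value of ν is illustrative.

```python
import numpy as np

def sample_t_fractional(size, nu, rng=None):
    """Sample Student's t with (possibly fractional) degrees of freedom nu via
    T = U / sqrt(V / nu), U standard normal, V ~ chi-square(nu) = gamma(nu/2, 2)."""
    rng = rng if rng is not None else np.random.default_rng()
    U = rng.normal(size=size)
    V = rng.gamma(shape=nu / 2.0, scale=2.0, size=size)   # chi-square with fractional nu
    return U / np.sqrt(V / nu)

rng = np.random.default_rng(0)
nu = 2.7                                   # fractional degrees of freedom (illustrative)
T = sample_t_fractional(200_000, nu, rng)
X = np.sqrt((nu - 2.0) / nu) * T           # rescaling to unit variance for nu > 2
print(T.var(), nu / (nu - 2.0), X.var())   # sample variance vs. nu/(nu-2), and roughly 1
```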
For ν > 2, the Student's t distribution in (44) can be normalized by the transformation

X = √((ν − 2)/ν) T,   ν > 2.   (46)

The normalized Student's t distribution with ν (ν > 2) fractional degrees of freedom has the pdf

f_X(x | ν) = √(ν/(ν − 2)) (Γ((ν+1)/2) / (Γ(ν/2) √(νπ))) (1 + x²/(ν − 2))^{−(ν+1)/2},   −∞ < x < ∞,   ν > 2.   (47)

Using the normalized Student's t distribution with fractional degrees of freedom as the distribution of the M and Z_i's in (3), we get our second extension to the one-factor Gaussian copula model, which we refer to as a double t distribution with fractional degrees of freedom copula model. In this model, the tail-fatness of the M and Z_i's can be changed continuously by adjusting the fractional degrees of freedom parameter ν.

One-Factor Double Mixture Distribution of t and Gaussian Distribution Copula Model

In the previous model, the tail-fatness of the M and Z_i's is controlled by the fractional degrees of freedom parameter of the Student's t distribution. Here, we introduce another distribution for the M and Z_i's, the mixture of the Student's t and Gaussian distributions. Assume U is a normalized Student's t random variable with fractional degrees of freedom and V is a standard normal random variable. We can express a mixture random variable X as

X = U with probability 1 − p,
    V with probability p,       0 ≤ p ≤ 1,   (48)

where p is the proportion of the Gaussian component in the mixture X. The pdf of X is

f(x) = (p/√(2π)) exp(−x²/2) + (1 − p) √(ν/(ν − 2)) (Γ((ν+1)/2) / (√(νπ) Γ(ν/2))) (1 + x²/(ν − 2))^{−(ν+1)/2},   (49)
where ν is the fractional degrees of freedom of the Student's t distribution. Using this mixture of Student's t and Gaussian distributions as the distribution of the M and Z_i's in (3), we get our third extension to the one-factor Gaussian copula model, which we refer to as a double mixture distribution of Student's t and Gaussian distribution copula model. In this model, with ν fixed, the tail-fatness of the M and Z_i's is controlled by the parameter p.

One-Factor Double Smoothly Truncated Stable Copula Model

In this part, we first introduce the stable distribution and the smoothly truncated stable distribution, and then present our proposed model.
Stable Distribution

A non-trivial distribution g is a stable distribution if and only if, for a sequence of independent, identically distributed random variables X_i, i = 1, 2, . . . , n, with distribution g, constants c_n > 0 and d_n can always be found for any n > 1 such that

c_n (X_1 + X_2 + · · · + X_n) + d_n =_d X_1,

where =_d denotes equality in distribution. In general, a stable distribution cannot be expressed in closed form except in three special cases: the Gaussian, Cauchy, and Lévy distributions. However, the characteristic function always exists and can be expressed in closed form. For a random variable X with a stable distribution g, the characteristic function of X is

ϕ_X(t) = E exp(itX) = exp(−γ^α |t|^α [1 − iβ sign(t) tan(πα/2)] + iδt),   α ≠ 1,
                      exp(−γ |t| [1 + iβ (2/π) sign(t) ln|t|] + iδt),     α = 1,   (50)

where 0 < α ≤ 2, γ ≥ 0, −1 ≤ β ≤ 1, and −∞ < δ < ∞, and the function sign(t) is 1 when t > 0, 0 when t = 0, and −1 when t < 0. Four parameters describe a stable distribution: (1) the index of stability or shape parameter α, (2) the scale parameter γ, (3) the skewness parameter β, and (4) the location parameter δ. A stable distribution g is called an α stable distribution and is denoted S_α(γ, β, δ) = S(α, γ, β, δ). The family of α stable distributions has three attractive properties:

• The sum of independent α stable random variables is again α stable, a property referred to as stability.
• α stable distributions can be skewed.
• Compared with the normal distribution, α stable distributions can have fatter tails and a higher peak around the center, a property referred to as leptokurtosis.
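For numerical work, the α stable family in (50) is available, for example, through scipy.stats.levy_stable. The snippet below is a sketch with illustrative parameter values; note that scipy's default handling of the location term for α near 1 may differ in detail from (50).

```python
import numpy as np
from scipy.stats import levy_stable

# Illustrative parameters: (alpha, beta, loc, scale) play the roles of (alpha, beta, delta, gamma).
alpha, beta, gamma_, delta = 1.7, 0.0, 1.0, 0.0
X = levy_stable.rvs(alpha, beta, loc=delta, scale=gamma_, size=200_000,
                    random_state=np.random.default_rng(0))

# Heavy tails: extreme quantiles of |X| grow much faster than for a Gaussian of unit scale.
print(np.quantile(np.abs(X), [0.50, 0.95, 0.99, 0.999]))
```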
Real-world financial market data indicate that asset returns tend to be fat-tailed, skewed, and peaked around the center. For this reason, α stable distributions have been a popular choice for modeling asset returns (see [19]).

Smoothly Truncated α Stable Distribution

One inconvenience of a stable distribution is that it has infinite variance except in the case α = 2. A new class of heavy-tailed distributions was proposed by Menn and Rachev [15, 16]: the smoothly truncated α stable distribution. A smoothly truncated α stable distribution is an α stable distribution with its two tails replaced by the tails of Gaussian distributions. Its pdf can be expressed as

f(x) = h_1(x)   for x < a,
       g_θ(x)   for a ≤ x ≤ b,   (51)
       h_2(x)   for x > b,
where h_i(x), i = 1, 2, are the pdfs of two normal distributions with means µ_i and standard deviations σ_i, and g_θ(x) is the pdf of an α stable distribution with parameter vector θ = (α, γ, β, δ). To obtain a well-defined, smooth probability density, the following regularity conditions are imposed:

h_1(a) = g_θ(a),   h_2(b) = g_θ(b),
p_1 := ∫_{−∞}^{a} h_1(x) dx = ∫_{−∞}^{a} g_θ(x) dx,
p_2 := ∫_{b}^{∞} h_2(x) dx = ∫_{b}^{∞} g_θ(x) dx,
σ_1 = ψ(ϕ^{−1}(p_1)) / g_θ(a),   µ_1 = a − σ_1 ϕ^{−1}(p_1),
σ_2 = ψ(ϕ^{−1}(p_2)) / g_θ(b),   µ_2 = b + σ_2 ϕ^{−1}(p_2),   (52)
where ψ and ϕ denote the density and distribution functions of the standard normal distribution, respectively. A smoothly truncated α stable distribution is referred to as an STS-distribution and denoted by S_α^{[a,b]}(γ, β, δ). The probabilities p_1 and p_2 are referred to as the cut-off probabilities, and the real numbers a and b as the cut-off points. The family of STS-distributions has two important properties. The first is that it is closed under scale and location transformations: if X has an STS-distribution, then for c, d ∈ R, the random variable Y := cX + d also has an STS-distribution. If X follows S_α^{[a,b]}(γ, β, δ), then Y follows S_{α'}^{[a',b']}(γ', β', δ') with

a' = ca + d,   b' = cb + d,   α' = α,   γ' = |c| γ,   β' = sign(c) β,
δ' = cδ + d                          for α ≠ 1,
δ' = cδ − (2/π) c (log|c|) γβ + d    for α = 1.   (53)
The other important property of the STS-distribution is that, for a given α stable distribution S_α(γ, β, δ), there is a unique normalized STS-distribution S_α^{[a,b]}(γ, β, δ) whose cut-off points a and b are uniquely determined by the four parameters α, γ, β, and δ. Because of this uniqueness of the cut-off points, the normalized STS-distribution can be denoted by the NSTS-distribution S_α(γ, β, δ).

One-Factor Double Smoothly Truncated Stable Copula Model

In the one-factor copula model given in (3), using the NSTS-distribution S_α(γ, β, δ) for the distribution of the market component M and the individual components Z_i's, we obtain the fourth extension to the one-factor Gaussian copula model. We refer to it as the one-factor double smoothly truncated α stable copula model. In this model, we can fix the parameters γ, β, and δ, and make α the only parameter controlling the tail-fatness of the copula function. When α = 2, the model reduces to the one-factor Gaussian copula model; as α decreases, the tail-fatness increases.
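The matching conditions (52) can be solved directly once the cut-off points are given. The sketch below treats a, b and the stable parameters as given, illustrative inputs (for an NSTS-distribution the cut-off points would instead be pinned down by the normalization) and uses scipy for the stable and standard normal density and distribution functions.

```python
import numpy as np
from scipy.stats import levy_stable, norm

def sts_tail_params(a, b, alpha, beta, gamma_, delta):
    """Solve the matching conditions (52) for the Gaussian tails of an STS distribution.
    Cut-off points a, b and the stable parameters are treated as given (illustrative)."""
    g = levy_stable(alpha, beta, loc=delta, scale=gamma_)
    p1 = g.cdf(a)                          # left cut-off probability
    p2 = 1.0 - g.cdf(b)                    # right cut-off probability
    sigma1 = norm.pdf(norm.ppf(p1)) / g.pdf(a)
    mu1 = a - sigma1 * norm.ppf(p1)
    sigma2 = norm.pdf(norm.ppf(p2)) / g.pdf(b)
    mu2 = b + sigma2 * norm.ppf(p2)
    return p1, p2, (mu1, sigma1), (mu2, sigma2)

print(sts_tail_params(a=-4.0, b=4.0, alpha=1.7, beta=0.0, gamma_=1.0, delta=0.0))
```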
7.2 Suggestions for the Structural Model and the Loss Process Model

The base-case structural model suggested by Hull et al. [7] can be an alternative to the one-factor Gaussian copula model, and the results of the two models are close. Given that, according to Hull and White [6], the one-factor double t copula model fits market data much better than the one-factor Gaussian copula model, a natural way to enhance the structural model is to apply heavy-tailed distributions. Unlike the one-factor copula model, where any continuous distribution with zero mean and unit variance can be used, the structural model imposes a strong constraint on the distribution of the underlying stochastic processes: the distribution of the common driving process M(t) and of the individual driving processes Z_i in (22) must be closed under summation. This means that if two independent random variables follow a given distribution, then their sum still follows the same distribution. As explained earlier, the α stable distribution has this property and has been used in financial modeling (see [18]). We therefore suggest using the α stable distribution in the structural model. The non-Gaussian α stable distribution has a drawback: its variance does not exist. The STS distribution is a good candidate to overcome this problem. For an STS distribution, if the two cut-off points a and b are far away from the peak, the distribution is approximately closed under summation. Employing the STS distribution in the structural model should therefore be a subject of future research. In the dynamic loss process model, the default intensities λ_i follow stochastic processes, as shown in (31). Using the α stable distribution and the STS distribution for these driving processes is also a possible research direction.
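The closure-under-summation requirement can be checked numerically in the α stable case: for a symmetric α stable law, the sum of two i.i.d. draws with scale γ has the same distribution as a single draw with scale 2^{1/α} γ. The following sketch, with illustrative parameters, compares quantiles of the two samples.

```python
import numpy as np
from scipy.stats import levy_stable

alpha, rng = 1.7, np.random.default_rng(0)
x1 = levy_stable.rvs(alpha, 0.0, size=100_000, random_state=rng)
x2 = levy_stable.rvs(alpha, 0.0, size=100_000, random_state=rng)
s = x1 + x2                                          # sum of two i.i.d. alpha-stable draws
y = 2.0 ** (1.0 / alpha) * levy_stable.rvs(alpha, 0.0, size=100_000, random_state=rng)

# The quantiles of the sum and of the rescaled single draw should agree closely (stability).
qs = [0.05, 0.25, 0.5, 0.75, 0.95]
print(np.quantile(s, qs))
print(np.quantile(y, qs))
```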
8 Summary

In this paper, we review three models for pricing portfolio credit risk: the one-factor copula model, the structural model, and the loss process model. We then propose how to improve these models by using heavy-tailed distributions. For the one-factor copula model, we suggest using (1) a double mixture Gaussian copula, (2) a double t distribution with fractional degrees of freedom copula, (3) a double mixture distribution of t and Gaussian distributions copula, and (4) a double smoothly truncated α stable copula. In each of these four new extensions to the one-factor Gaussian copula model, one parameter is introduced to control the tail-fatness of the copula function. To improve the structural and loss process models, we suggest using the stable distribution and the smoothly truncated stable distribution for the underlying stochastic driving processes.
References

[1] Amato J, Gyntelberg J (2005) CDS index tranches and the pricing of credit risk correlations. BIS Quarterly Review, March 2005, pp 73–87
[2] Bennani N (2005) The forward loss model: a dynamic term structure approach for the pricing of portfolio credit derivatives. Working paper, available at http://www.defaultrisk.com/pp_crdrv_95.htm
[3] Black F, Cox J (1976) Valuing corporate securities: some effects of bond indenture provisions. The Journal of Finance, vol 31, pp 351–367
[4] Di Graziano G, Rogers C (2005) A new approach to the modeling and pricing of correlation credit derivatives. Working paper, available at http://www.defaultrisk.com/pp_crdrv_88.htm
[5] Duffie D (2004) Comments: irresistible reasons for better models of credit risk. Financial Times, April 16, 2004
[6] Hull J, White A (2004) Valuation of a CDO and nth to default CDS without Monte Carlo simulation. The Journal of Derivatives, vol 2, pp 8–23
[7] Hull J, Predescu M, White A (2005) The valuation of correlation-dependent credit derivatives using a structural model. Working paper, Joseph L. Rotman School of Management, University of Toronto, available at http://www.defaultrisk.com/pp_crdrv_68.htm
[8] Kalemanova A, Schmid B, Werner R (2005) The normal inverse Gaussian distribution for synthetic CDO pricing. Working paper, available at http://www.defaultrisk.com/pp_crdrv_91.htm
[9] Laurent JP, Gregory J (2003) Basket default swaps, CDOs and factor copulas. Working paper, ISFA Actuarial School, University of Lyon, available at http://www.defaultrisk.com/pp_crdrv_26.htm
[10] Li DX (1999) The valuation of basket credit derivatives. CreditMetrics Monitor, April 1999, pp 34–50
[11] Li DX (2000) On default correlation: a copula function approach. The Journal of Fixed Income, vol 9, pp 43–54
[12] Lucas DJ, Goodman LS, Fabozzi FJ (2006) Collateralized debt obligations: structures and analysis, 2nd edn. Wiley Finance, Hoboken
[13] Mardia K, Zemroch P (1978) Tables of the F- and related distributions with algorithms. Academic Press, New York
[14] McGinty L, Beinstein E, Ahluwalia R, Watts M (2004) Credit correlation: a guide. Credit Derivatives Strategy, JP Morgan, London, March 12, 2004
[15] Menn C, Rachev S (2005) A GARCH option pricing model with alpha-stable innovations. European Journal of Operational Research, vol 163, pp 201–209
[16] Menn C, Rachev S (2005) Smoothly truncated stable distributions, GARCH-models, and option pricing. Working paper, University of Karlsruhe and UCSB, available at http://www.statistik.uni-karlsruhe.de/download/tr_smoothly_truncated.pdf
[17] Merton R (1974) On the pricing of corporate debt: the risk structure of interest rates. The Journal of Finance, vol 29, pp 449–470
[18] Rachev S, Mittnik S (2000) Stable Paretian models in finance. John Wiley, Series in Financial Economics and Quantitative Analysis, Chichester
[19] Rachev S, Menn C, Fabozzi FJ (2005) Fat-tailed and skewed asset return distributions: implications for risk management, portfolio selection, and option pricing. Wiley Finance, Hoboken
[20] Schönbucher P (2005) Portfolio losses and the term structure of loss transition rates: a new methodology for the pricing of portfolio credit derivatives. Working paper, available at http://www.defaultrisk.com/pp_model_74.htm
[21] Sidenius J, Piterbarg V, Andersen L (2005) A new framework for dynamic credit portfolio loss modeling. Working paper, available at http://www.defaultrisk.com/pp_model_83.htm