Extreme Value Theory (EVT) and Heavy tails
Extreme Value Theory (EVT) is concerned with the behavior of a distribution in its extremes. The extremes determine the average, not the reverse. If you understand the extremes, the average follows. But getting the extremes right is extremely difficult. By construction, you have very few data points. Put differently, if you have many data points, then it is not the extremes you are dealing with.
Risk management is enjoying a lot of progress on many fronts: in density estimation, in volatility modelling, and in quantile estimation (Value at Risk). I recently started to connect those topics with another strand of literature, that of tail estimation. What is Extreme Value Theory if not tail estimation? And what is tail estimation if not a concentrated effort to identify the shape in a particular region of a distribution?
When I think heavy tails, my go-to is the Kurtosis measure. For this post I pull the SPY ticker from Yahoo, calculate daily returns, group them by year and calculate the Kurtosis measure per year. Have a look at the result:
The bar chart shows the excess Kurtosis, that is, the excess over the normal distribution, which has a Kurtosis of 3. Some years have very high Kurtosis, while others are quite similar to the normal distribution. All years have Kurtosis above 3 (positive excess Kurtosis), which indicates heavy tails (shocking eh?).
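The per-year calculation can be sketched in base R along these lines. This is a minimal illustration, not the code behind the chart: `excess_kurtosis` is a helper written for this example, and the returns are simulated from a t-distribution as a stand-in for the SPY data.

```r
# Sample excess kurtosis: fourth standardized moment minus 3 (the normal value)
excess_kurtosis <- function(x) {
  x <- x[!is.na(x)]
  m2 <- mean((x - mean(x))^2)
  m4 <- mean((x - mean(x))^4)
  m4 / m2^2 - 3
}

# Hypothetical stand-in data: heavy-tailed "returns" on weekdays, 2014-2016
set.seed(1)
dates <- seq(as.Date("2014-01-01"), as.Date("2016-12-31"), by = "day")
dates <- dates[!as.POSIXlt(dates)$wday %in% c(0, 6)]  # drop Sundays (0) and Saturdays (6)
ret <- rt(length(dates), df = 4) / 100

# Group by calendar year and compute excess kurtosis within each year
kurt_by_year <- tapply(ret, format(dates, "%Y"), excess_kurtosis)
print(round(kurt_by_year, 2))
# barplot(kurt_by_year, main = "Excess kurtosis per year")
```

With the real data, `ret` and `dates` would be replaced by the daily returns and their dates.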
But Kurtosis has to do with the whole distribution, not exactly with the tails.
We would like to know more about the tails specifically. One reason is that there may be a difference between the left tail and the right tail. Plus, we want to see what the tails look like; it can help us feel more (un)comfortable with our Value at Risk (VaR) estimates. The most natural place to start is with kernel density estimation using the density function. We then zoom in on the left and right tails of the distributions. Here it is:
From this it seems that the tail density is time varying. A side note here, in the literature you can find the term ‘tail-distribution’ used quite often. I follow this convention as well. But, perhaps this convention should not have been adopted. ‘Tail-density’ should have been used instead. The tail is part of the whole distribution, to me it seems that assigning it a ‘distribution’ term on its own is unwarranted and avoidably confusing. The convention should be something like: “how the tail density behaves”.
Observe that the left and right tails differ from each other. Take for example the ugly 2008: you can see the “lump” between -4% and -6%. There is a “lump” in the right tail as well, but it is at around 3.5-4%. You can see the V-shaped recovery in 2009, with a fatter right tail.
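For readers who want to reproduce this kind of tail zoom, here is a rough base R sketch. It uses simulated returns rather than the SPY data, and cutting the tails at the 5% and 95% quantiles is an assumption made for the example, not necessarily what the figure uses.

```r
set.seed(4)
ret <- rt(2500, df = 4) / 100        # hypothetical heavy-tailed daily returns

dens <- density(ret)                  # kernel density estimate on a regular grid
left_cut  <- quantile(ret, 0.05)
right_cut <- quantile(ret, 0.95)

# Logical masks selecting the grid points that fall in each tail
left_tail  <- dens$x <= left_cut
right_tail <- dens$x >= right_cut

# Approximate probability mass in each tail (Riemann sum over the grid)
step <- diff(dens$x[1:2])
mass_left  <- sum(dens$y[left_tail])  * step
mass_right <- sum(dens$y[right_tail]) * step
print(c(left = mass_left, right = mass_right))

# Zoomed plots, one per tail:
# plot(dens$x[left_tail],  dens$y[left_tail],  type = "l", main = "Left tail")
# plot(dens$x[right_tail], dens$y[right_tail], type = "l", main = "Right tail")
```

Running the same steps on each year separately gives one pair of tail curves per year.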
With this figure you can also understand why Expected shortfall is gaining strength with regulators and among serious risk managers. Expected shortfall should be preferred over the more widely used Value-at-Risk. The estimated return distributions are quite lumpy, and the real unknown distribution can be lumpy also. The Expected shortfall measure talks expectation, instead of a quantile snapshot as the VaR does. By computing (conditional) expectation we can account for those increased probabilities (lumps) cropping up after the VaR-quantile level we have chosen.
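The difference between the two measures can be made concrete with a small base R sketch. The data are made up: a calm regime plus a small “lump” of losses around -5%. `var_es` is a helper defined here for illustration, using the plain empirical quantile and the average of the returns at or below it.

```r
# Empirical VaR and expected shortfall for the left tail at level alpha.
# Both are reported as returns (negative numbers), not as positive losses.
var_es <- function(ret, alpha = 0.05) {
  var_level <- as.numeric(quantile(ret, alpha))
  es_level  <- mean(ret[ret <= var_level])  # average return at or beyond the VaR
  c(VaR = var_level, ES = es_level)
}

set.seed(2)
# Hypothetical lumpy returns: 97% calm days, 3% of days clustered near -5%
ret <- c(rnorm(970, 0, 0.01), rnorm(30, -0.05, 0.005))
out <- var_es(ret, alpha = 0.05)
print(round(out, 4))
```

Because the lump sits beyond the 5% quantile, the VaR barely registers it, while the expected shortfall averages over it and is pulled towards the lump.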
Taking it to the extremes 2.0
Recently Eric Gilleland and Richard W. Katz, both coming from weather and climate research, published a paper on their excellent package in the Journal of Statistical Software. The paper, extRemes 2.0: An Extreme Value Analysis Package in R, is an excellent read for those of you who want to know more about tail estimation in R. The math is there for all to enjoy, but the illustrations make the paper accessible for everyone.
Let’s give you a flavor of what this impressive package can do.
We start with a simulation from a normal distribution with the same mean and standard deviation as the original data, so as to have some anchor for (tail) comparison. We then use the fevd function to estimate both the right and left tails of the distribution of 10 years of daily returns.
Eric Gilleland, Richard W. Katz (2016). extRemes 2.0: An Extreme Value Analysis Package in R. Journal of Statistical Software, 72(8), 1-39.
library(extRemes)
# Simulate normal on the percent scale (matching 100*retd below),
# so that the 1.65 threshold actually has exceedances
simnorm <- rnorm(length(retd), mean(100*retd), sd(100*retd))
# Estimate generalized Pareto distribution for the simulated tail using Maximum Likelihood
# I use 1.65 as a threshold
fit_norm <- fevd(simnorm, threshold = 1.65, type = "GP", use.phi = TRUE, method = "MLE")
# Estimate generalized Pareto distribution for the left tail using Maximum Likelihood
# The sign is flipped so that losses become the upper tail
tmp_threshold <- quantile(100*retd, .1) # set the threshold (10% quantile)
fit_left <- fevd(-100*retd, threshold = -tmp_threshold, type = "GP", use.phi = TRUE, method = "MLE")
# Estimate generalized Pareto distribution for the right tail using Maximum Likelihood
tmp_threshold <- quantile(100*retd, .9) # set the threshold (90% quantile)
fit_right <- fevd(100*retd, threshold = tmp_threshold, type = "GP", use.phi = TRUE, method = "MLE")
Tail estimation for the normal simulation
There is no need to estimate the left tail since the normal distribution is symmetric. The tail of the normal distribution looks, well, normal. From the bottom right chart we can see that the estimated quantiles are aligned with the empirical quantiles.
Right tail estimation
Turning attention to the daily returns distribution, the estimate for the right tail declines quite sharply, but there are a couple of very large positive returns which such a smooth density estimate simply cannot handle. This can be seen in the two odd observations in the quantile-quantile plot at the bottom right. Up until 6%, the fit is excellent.
Left tail estimation
Now, the density estimate for the left tail clearly demonstrates why checking only the Kurtosis can be misleading. The left tail is quite different from the right one. Especially after the 3% level, the probability in the right tail declines very quickly while the probability in the left tail declines only slowly.
Estimating the Extremal Index
The Extremal Index is a measure of auto-dependence in the tails. Similarly to volatility clustering, we can think of extreme clustering. An Extremal Index of one means the extreme observations do not cluster, while an Extremal Index below one implies that extremes tend to occur in clusters. Reference at the bottom of the post.
We can use the extremalindex function to estimate auto-dependence in the left and right tails of the daily return distribution. Here are the results:
Clustering means large changes are followed by more large changes. We see that the left tail is more clustered than the right tail (Extremal Index below one). You can think about the Extremal Index as a non-linear autocorrelation measure. We see that in the extremes, negative observations are more “autocorrelated” than positive observations. This is not to be confused with the leverage effect, which says negative shocks have a larger impact on volatility than positive shocks. It is more of a complement, which relates to the extremes only.
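To build some intuition for what such an estimate measures, here is a base R sketch of a simple runs estimator: the number of clusters of exceedances divided by the number of exceedances. It is written for illustration only and is not the package's extremalindex implementation. The function name `runs_extremal_index`, the 95% threshold and the simulated series are all assumptions of this example.

```r
# Runs estimator of the extremal index: number of clusters / number of exceedances.
# A new cluster starts when an exceedance follows at least `run_length`
# consecutive non-exceedances.
runs_extremal_index <- function(x, threshold, run_length = 1) {
  exceed_pos <- which(x > threshold)
  if (length(exceed_pos) == 0) return(NA_real_)
  gaps <- diff(exceed_pos)                  # distance between successive exceedances
  n_clusters <- 1 + sum(gaps > run_length)
  n_clusters / length(exceed_pos)
}

set.seed(3)
z <- rt(5001, df = 4)
iid <- z[-1]                                # independent observations
clustered <- pmax(z[-1], z[-length(z)])     # moving maximum: extremes arrive in pairs

theta_iid <- runs_extremal_index(iid, quantile(iid, 0.95))
theta_clustered <- runs_extremal_index(clustered, quantile(clustered, 0.95))
print(c(iid = theta_iid, clustered = theta_clustered))
```

The independent series should give a value close to one. The moving-maximum series has a theoretical Extremal Index of one half, so its estimate should come out clearly lower. Applied to returns, the left tail is handled by flipping the sign of the series first.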
Much more is possible with the package.
As an aside, when I was playing around with the package I started by estimating the shape parameter in a generalized Pareto distribution. The shape parameter determines how the tail behaves. The functions work perfectly, and the JSS paper helps to understand what they are doing. However, my intention was to compare tail behavior using those estimates of the shape parameter; I could not. The reason is that the other parameters, location and scale, are not fixed, so we would be comparing apples and oranges.
I expect this package to be very useful in the future. In climate and atmospheric research surely, but also in financial risk management. Well done.
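A small numerical sketch shows why the shape parameter alone does not settle the comparison. The two parameter sets below are made up, and `gpd_survival` is a helper written for this example: it evaluates the generalized Pareto probability that an excess over the threshold is larger than y, for a nonzero shape.

```r
# Generalized Pareto survival function for an excess y >= 0 over the threshold.
# Valid for shape != 0; the shape = 0 (exponential) limit is not handled here.
gpd_survival <- function(y, scale, shape) {
  pmax(1 + shape * y / scale, 0)^(-1 / shape)
}

# Two hypothetical fits: A has the smaller shape but the larger scale
p_a <- gpd_survival(3, scale = 1.5, shape = 0.1)
p_b <- gpd_survival(3, scale = 0.8, shape = 0.3)
print(c(A = p_a, B = p_b))
```

In this made-up case, fit A has the lighter tail judged by shape, yet it assigns the larger probability to an excess of 3, because its scale is larger. Far enough out the heavier shape of B eventually dominates, so any comparison has to look at shape, scale and threshold together.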
An Introduction to Statistical Modeling of Extreme Values (this book is heavily cited as a reference)
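Code for pulling the data:
library(quantmod) ; citation("quantmod")
symetf = c('SPY')
l = length(symetf)
# Assumption: the original start/end dates are not shown in the post;
# ten years of data matches the text above
end <- Sys.Date()
start <- end - 365*10
dat0 <- lapply(symetf, getSymbols, src = "yahoo", from = start, to = end,
               auto.assign = FALSE, warnings = FALSE, symbol.lookup = FALSE)
xd <- dat0[[1]] # the single xts object for SPY
timee <- index(xd)
# Daily simple returns from the closing price (column 4)
retd <- as.numeric(xd[2:NROW(xd),4])/as.numeric(xd[1:(NROW(xd)-1),4]) -1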