Outliers and Loss Functions

A few words about outliers

In statistics, outliers are as thorny topic as it gets. Is it legitimate to treat the observations seen during global financial crisis as outliers? or are those simply a feature of the system, and as such are integral part of a very fat tail distribution?

More

Density Confidence Interval

Density estimation belongs with the literature of non-parametric statistics. Using simple bootstrapping techniques we can obtain confidence intervals (CI) for the whole density curve. Here is a quick and easy way to obtain CI’s for different risk measures (VaR, expected shortfall) and using what follows, you can answer all kind of relevant questions.

More

Modeling Tail Behavior with EVT

Extreme Value Theory (EVT) and Heavy tails

Extreme Value Theory (EVT) is busy with understanding the behavior of the distribution, in the extremes. The extreme determine the average, not the reverse. If you understand the extreme, the average follows. But, getting the extreme right is extremely difficult. By construction, you have very few data points. By way of contradiction, if you have many data points then it is not the extreme you are dealing with.

More

Multivariate Volatility Forecast Evaluation

The evaluation of volatility models is gracefully complicated by the fact that, unlike other time series, even the realization is not observable. Two researchers would never disagree about what was yesterday’s stock price, but they can easily disagree about what was yesterday’s stock volatility. Because we don’t observe volatility directly, each of us uses own proxy of choice. There are many ways to skin this cat (more on volatility proxy here).

In a previous post Univariate volatility forecast evaluation we considered common ways in which we can evaluate how good is our volatility model, dealing with one time-series at a time. But how do we evaluate, or compare two models in a multivariate settings, with two covariance matrices?

More

Why bad trading strategies may perform well? Mathematical explanation

You probably know that even a trading strategy which is actually no different from a random walk (RW henceforth) can perform very well. Perhaps you chalk it up to short-run volatility. But in fact there is a deeper reason for this to happen, in force. If you insist on using and continuously testing a RW strategy, you will find, at some point with certainty, that it has significant outperformance.

This post explains why is that.

More

Multivariate volatility forecasting (3), Exponentially weighted model

Broadly speaking, complex models can achieve great predictive accuracy. Nonetheless, a winner in a kaggle competition is required only to attach a code for the replication of the winning result. She is not required to teach anyone the built-in elements of his model which gives the specific edge over other competitors. In a corporation settings your manager and his manager and so forth MUST feel comfortable with the underlying model. Mumbling something like “This artificial-neural-network is obtained by using a grid search over a range of parameters and connection weights where the architecture itself is fixed beforehand…”, forget it!

More

Correlation and correlation structure (2), copulas

This post is about copulas and heavy tails. In a previous post we discussed the concept of correlation structure. The aim is to characterize the correlation across the distribution. Prior to the global financial crisis many investors were under the impression that they were diversified, and they were, for how things looked there and then. Alas, when things went south, correlation in the new southern regions turned out to be different\stronger than that in normal times. The hard-won diversification benefits evaporated exactly when you needed them the most. This adversity has to do with fat-tail in the joint distribution, leading to great conceptual and practical difficulties. Investors and bankers chose to swallow the blue pill, and believe they are in the nice Gaussian world, where the math is magical and elegant. Investors now take the red pill, where the math is ugly and problems abound.

More

Correlation and correlation structure (1); quantile regression

Given a constant speed, time and distance are fully correlated. Provide me with the one, and I’ll give you the other. When two variables have nothing to do with each other, we say that they are not correlated.

You wish that would be the end of it. But it is not so. As it is, things are perilously more complicated. By far the most familiar correlation concept is the Pearson’s correlation. Pearson’s correlation coefficient checks for linear dependence. Because of it, we say it is a parametric measure. It can return an actual zero even when the two variables are fully dependent on each other (link to cool chart).

More

Multivariate volatility forecasting (1)

Introduction

When hopping from univariate volatility forecasts to multivariate volatility forecast, we need to understand that now we have to forecast not only the univariate volatility element, which we already know how to do, but also the covariance elements, which we do not know how to do, yet. Say you have two series, then this covariance element is the off-diagonal of the 2 by 2 variance-covariance matrix. The precise term we should use is “variance-covariance matrix”, since the matrix consists of the variance elements on the diagonal and the covariance elements on the off-diagonal. But since it is very tiring to read\write “variance-covariance matrix”, it is commonly referred to as the covariance matrix, or sometimes less formally as var-covar matrix.

More

Volatility forecast evaluation in R

In portfolio management, risk management and derivative pricing, volatility plays an important role. So important in fact that you can find more volatility models than you can handle (Wikipedia link). What follows is to check how well each model performs, in and out of sample. Here are three simple things you can do:

More