Test of Equality Between Two Densities

Are returns this year actually different than what can be expected from a typical year? Is the variance actually different than what can be expected from a typical year? Those are fairly light, easy-to-answer questions: we can use tests for equality of means or equality of variances.
But how about the following question:

Is the profile/behavior of returns this year different than what can be expected in a typical year?

This is a more general and important question, since it encompasses all moments and tail behavior. And it is not as trivial to answer.

In this post I am scratching an itch I have had since I wrote Understanding Kullback – Leibler Divergence. In the Kullback – Leibler Divergence post we saw how to quantify the difference between densities, exemplified using the SPY return density per year. Once I was done with that post, I thought there must be a way to test the difference formally, rather than just quantify, visualize and eyeball it. And indeed there is. The aim of this post is to show how to formally test for equality between densities.
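To give a flavor of what such a test can look like, here is a minimal sketch of one possible approach: a permutation test built on the Kolmogorov-Smirnov statistic. This is a stand-in chosen for illustration, not necessarily the test used in the full post, and the function names (`ks_statistic`, `permutation_test`) and the toy data are my own assumptions.

```python
import random


def ks_statistic(x, y):
    """Largest absolute gap between the two empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    i = j = 0
    gap = 0.0
    while i < n and j < m:
        v = min(xs[i], ys[j])
        # Step both empirical CDFs past the current value (handles ties).
        while i < n and xs[i] <= v:
            i += 1
        while j < m and ys[j] <= v:
            j += 1
        gap = max(gap, abs(i / n - j / m))
    return gap


def permutation_test(x, y, n_perm=499, seed=0):
    """Permutation p-value for H0: x and y come from the same distribution."""
    rng = random.Random(seed)
    observed = ks_statistic(x, y)
    pooled = list(x) + list(y)
    n = len(x)
    hits = 0
    for _ in range(n_perm):
        # Under H0 the labels are exchangeable, so reshuffle them.
        rng.shuffle(pooled)
        if ks_statistic(pooled[:n], pooled[n:]) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy data (made up): a "typical" year versus a more volatile one.
toy_rng = random.Random(1)
typical = [toy_rng.gauss(0.0, 1.0) for _ in range(250)]
volatile = [toy_rng.gauss(0.0, 2.5) for _ in range(250)]
print(permutation_test(typical, volatile))
```

A small p-value suggests the two samples do not share a distribution. Note that this sketch treats observations as independent, which daily returns are typically not, so read it as an illustration of the idea rather than a recipe.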

More

Orthogonality in Statistics

Orthogonality in mathematics

The word Orthogonality originates from a combination of two words in ancient Greek: orthos (upright), and gonia (angle). It has a geometrical meaning: two lines create a 90-degree angle between them. So one line is perpendicular to the other line. Like so:

Perpendicular
Source: Wikipedia

Even though Orthogonality is a geometrical term, it appears very often in statistics. You probably know that in a statistical context orthogonality means uncorrelated, or linearly independent. But why?

Why use a geometrical term to describe a statistical relation between random variables? By extension, why does the word angle appear in the incredibly common regression method least-angle regression (LARS)? Enough losing sleep over it (as you undoubtedly do), an extensive answer below.
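As a quick numerical hint at the connection, here is a minimal sketch: once two data vectors are centered (mean removed), the cosine of the angle between them equals their sample correlation. So zero correlation corresponds to a 90-degree angle. The helper names (`center`, `cosine`, `pearson`) and the numbers are made up for illustration.

```python
import math


def center(v):
    """Subtract the mean so the vector sums to zero."""
    mean = sum(v) / len(v)
    return [a - mean for a in v]


def cosine(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson(x, y):
    """Sample correlation: covariance over the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / (n - 1))
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / (n - 1))
    return cov / (sx * sy)


x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 1.0, 4.0, 3.0]
# The two numbers agree up to floating-point rounding.
print(cosine(center(x), center(y)), pearson(x, y))
```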

More

Market intraday momentum

I recently spotted the following intriguing paper: Market intraday momentum.
From the abstract of that paper:

Based on high frequency S&P 500 exchange-traded fund (ETF) data from 1993–2013, we show an intraday momentum pattern: the first half-hour return on the market as measured from the previous day’s market close predicts the last half-hour return. This predictability, which is both statistically and economically significant is stronger on more volatile days, on higher volume days, on recession days, and on major macroeconomic news release days.

Nice! Looks like we can all become rich now. I mean, given how it’s written, it should be quite easy for any individual with a trading account and a mouse to leverage up and start accumulating. Maybe this is so, but let’s have an informal closer look, with as little effort as possible, and see if there is anything we can say about this idea.

More

Curse of dimensionality part 3: Higher-Order Comoments

Higher moments such as Skewness and Kurtosis are not as explored as they should be.

These moments are crucial for managing portfolio risk. At least as important as volatility, if not more. Skewness relates to asymmetry risk and Kurtosis relates to tail risk.

Despite their great importance, those higher moments enjoy only a small portion of attention compared with their lower, friendlier moments: the mean and the variance. In my opinion, one reason for this may be the impossibility of estimating those moments, estimating them accurately that is.

It is yet another situation where the Curse of Dimensionality rears its enchanting head (and an idea for a post is born...).

More

Understanding Kullback – Leibler Divergence

It is easy to measure distance between two points. But what about measuring distance between two distributions? Good question. Long answer. Welcome the Kullback – Leibler Divergence measure.

The motivation for thinking about the Kullback – Leibler Divergence measure is that you can pick up questions such as: “how different was the behavior of the stock market this year compared with the average behavior?”. This is a rather different question than the trivial “how was the return this year compared to the average return?”.
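For discrete distributions (for example, histogram bin probabilities), the measure is $D(P \| Q) = \sum_i p_i \log(p_i / q_i)$. Here is a minimal sketch, assuming both inputs are probability vectors over the same bins and that $q_i > 0$ wherever $p_i > 0$; the function name and the numbers are made up for illustration.

```python
import math


def kl_divergence(p, q):
    """D(P || Q) = sum_i p_i * log(p_i / q_i); bins with p_i == 0 contribute zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy bin probabilities (made up): a peaked year versus a flatter average.
this_year = [0.1, 0.8, 0.1]
average = [0.3, 0.4, 0.3]

print(kl_divergence(this_year, average))
print(kl_divergence(average, this_year))
```

The two printed numbers differ: the measure is not symmetric, which is one reason it is called a divergence rather than a distance.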

More

Statistical Shrinkage

Shrinkage in statistics has increased in popularity over the decades. Now statistical shrinkage is commonplace, explicitly or implicitly.

But when do we need to make use of shrinkage? At least partly, it depends on the signal-to-noise ratio.

More

Outliers and Loss Functions

A few words about outliers

In statistics, outliers are as thorny a topic as it gets. Is it legitimate to treat the observations seen during the global financial crisis as outliers? Or are those simply a feature of the system, and as such an integral part of a very fat-tailed distribution?

More

Density Confidence Interval

Density estimation belongs to the literature of non-parametric statistics. Using simple bootstrapping techniques we can obtain confidence intervals (CI) for the whole density curve. Here is a quick and easy way to obtain CIs for different risk measures (VaR, expected shortfall), and using what follows you can answer all kinds of relevant questions.
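To illustrate the bootstrap idea on a single risk measure, here is a minimal sketch of a percentile-bootstrap interval for an empirical 5% VaR, taken here simply as the 5% quantile of returns. This is a simpler stand-in for the whole-density interval, not the procedure from the full post; the function names, defaults and toy data are my own assumptions, and observations are resampled as if they were independent.

```python
import random


def empirical_quantile(data, prob):
    """Order-statistic quantile: roughly `prob` of the data lies below it."""
    ordered = sorted(data)
    index = min(len(ordered) - 1, max(0, int(prob * len(ordered))))
    return ordered[index]


def bootstrap_var_ci(returns, prob=0.05, level=0.95, n_boot=1000, seed=0):
    """Percentile-bootstrap interval for the `prob` quantile of returns."""
    rng = random.Random(seed)
    n = len(returns)
    # Resample with replacement and recompute the quantile each time.
    estimates = sorted(
        empirical_quantile(rng.choices(returns, k=n), prob)
        for _ in range(n_boot)
    )
    tail = (1.0 - level) / 2.0
    lower = estimates[int(tail * n_boot)]
    upper = estimates[min(n_boot - 1, int((1.0 - tail) * n_boot))]
    return lower, upper


# Toy daily returns (made up), in percent.
toy_rng = random.Random(42)
toy_returns = [toy_rng.gauss(0.05, 1.0) for _ in range(500)]
print(empirical_quantile(toy_returns, 0.05), bootstrap_var_ci(toy_returns))
```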

More

Modeling Tail Behavior with EVT

Extreme Value Theory (EVT) and Heavy tails

Extreme Value Theory (EVT) is concerned with understanding the behavior of the distribution in the extremes. The extremes determine the average, not the reverse. If you understand the extremes, the average follows. But getting the extremes right is extremely difficult. By construction, you have very few data points. By way of contradiction, if you have many data points then it is not the extremes you are dealing with.

More

Good coding practices – part 2

Introduction

In part 1 of Good coding practices we considered how best to code for someone else, be it a colleague who is coming from an Excel environment and is unfamiliar with scripting, a collaborator, a client or the future-you, the you a few months from now. In this second part, I give some of my thoughts on how best to write functions, the dos and don’ts.

More

Forecast averaging example

Especially in economics/econometrics, modellers do not believe their models reflect reality as it is. No, the yield curve does NOT follow a three-factor Nelson-Siegel model, the relation between a stock and its underlying factors is NOT linear, and volatility does NOT follow a GARCH(1,1) process, nor GARCH(?,?) for that matter. We simply look at the world, and try to find an apt description of what we see.

More

Measurement error bias

What is measurement error bias?

An errors-in-variables, or measurement error, situation arises when your right-hand-side variable(s), your $x$ in a $y_t = \alpha + \beta x_t + \varepsilon_t$ model, is measured with error. If $x$ represents the price of a liquid stock, then it is accurately measured because the trading is so frequent. But if $x$ is a volatility, well, it is not accurately measured. We simply don’t yet have the power to tame this variable variable.

Unlike the price itself, volatility estimates change with our choice of measurement method. Since no model is a perfect depiction of reality, we have a measurement error problem on our hands.

Ignoring measurement errors leads to biased estimates and, good God, inconsistent estimates.
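A small simulation makes the bias visible. Under the classical setup (my assumption here: the measurement noise $u_t$ is independent of both $x_t$ and $\varepsilon_t$), regressing $y$ on the noisy $x_t + u_t$ pulls the slope toward zero, with limit $\beta \sigma_x^2 / (\sigma_x^2 + \sigma_u^2)$. The sketch below uses made-up parameter values and hypothetical function names.

```python
import random


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate(beta=2.0, noise_sd=1.0, n=20000, seed=0):
    """Return (slope using the true x, slope using x measured with error)."""
    rng = random.Random(seed)
    x_true = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [1.0 + beta * x + rng.gauss(0.0, 1.0) for x in x_true]
    # What we actually observe: the true x plus independent noise.
    x_observed = [x + rng.gauss(0.0, noise_sd) for x in x_true]
    return ols_slope(x_true, y), ols_slope(x_observed, y)


clean_slope, noisy_slope = simulate()
print(clean_slope, noisy_slope)
```

With unit variance for both $x$ and the noise, the second slope should land near $2 \times 1/(1+1) = 1$ rather than near the true value of 2, and collecting more data does not fix it. That is the inconsistency.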

More

The case for Regime-Switching GARCH

GARCH models are very responsive in the sense that they allow the fit of the model to adjust rather quickly with incoming observations. However, this adjustment depends on the parameters of the model, and those may not be constant. Parameter estimation for a GARCH process is not as quick as for, say, a simple regression, especially in the multivariate case. Because of that, I think, the literature on time-varying GARCH is not yet at full speed. This post makes the point that there is a need for such a class of models. I demonstrate this by looking at the parameters of a Threshold-GARCH model (aka GJR GARCH), before and after the 2008 crisis. In addition, you can learn how to make inference on GARCH parameters without relying on asymptotic normality, i.e. using the bootstrap.

More