If you are reading this, you already know that the covariance matrix represents **unconditional** linear dependency between the variables. Far less mentioned is the bewitching fact that the elements of the *inverse* of the covariance matrix (i.e. the precision matrix) encode the **conditional** linear dependence between the variables. This post shows why that is the case. I start with the motivation to even discuss this, then the math, then some code.
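
To make the claim concrete, here is a minimal sketch on simulated data. The variable names and the data-generating process are assumptions made for illustration: two series `x` and `y` are both driven by a common series `z`, so they are strongly correlated unconditionally, yet the partial correlation read off the precision matrix is close to zero once `z` is accounted for.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical setup: x and y depend on each other only through z.
z = rng.standard_normal(n)
x = z + 0.5 * rng.standard_normal(n)
y = z + 0.5 * rng.standard_normal(n)

data = np.column_stack([x, y, z])
cov = np.cov(data, rowvar=False)
precision = np.linalg.inv(cov)

# Unconditional (marginal) correlation between x and y.
marginal_corr = cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])

# Partial correlation between x and y given z, from the precision matrix:
# rho_ij|rest = -P_ij / sqrt(P_ii * P_jj)
partial_corr = -precision[0, 1] / np.sqrt(precision[0, 0] * precision[1, 1])

print(f"marginal correlation: {marginal_corr:.2f}")
print(f"partial correlation:  {partial_corr:.2f}")
```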

# Category: Finance and Trading

## Randomized Matrix Multiplication

Matrix multiplication is a fundamental computation in modern statistics. It’s at the heart of all current serious AI applications. The size of the matrices nowadays is gigantic. On a good system it takes around 30 seconds to estimate the covariance of a data matrix with dimensions $X_{10000 \times 2500}$, small data by today’s standards, mind you. Need to do it 10000 times? Wait for roughly 80 hours. Have larger data? Running time grows quickly. Want a more complex operation than a covariance estimate? Forget it, or get ready to dig deep into your pockets.

We, mere minions who are unable to splurge thousands of dollars on high-end G/TPUs, are left unable to work with large matrices due to the massive computational requirements; because who wants to wait a few weeks to discover their bug?

This post offers a solution by way of approximation, using randomization. I start with the idea, followed by a short proof, and conclude with some code and a few run-time results.
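
As a rough illustration of the idea, the sketch below uses one standard sampling scheme, which is not necessarily the one used in the post: draw column/row pairs with probability proportional to the product of their norms, and rescale so the estimate is unbiased. The function name `randomized_matmul` and the matrix sizes are assumptions chosen for the example.

```python
import numpy as np

def randomized_matmul(A, B, n_samples, rng):
    """Approximate A @ B using n_samples sampled column/row pairs."""
    # Sampling probabilities proportional to ||A[:, i]|| * ||B[i, :]||.
    weights = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
    probs = weights / weights.sum()
    idx = rng.choice(A.shape[1], size=n_samples, replace=True, p=probs)
    # Rescale each sampled pair so that the estimator is unbiased.
    scale = 1.0 / (n_samples * probs[idx])
    return (A[:, idx] * scale) @ B[idx, :]

rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 50))  # small hypothetical data matrix

exact = X.T @ X
approx = randomized_matmul(X.T, X, n_samples=1000, rng=rng)

# Relative Frobenius error of the approximation.
rel_error = np.linalg.norm(approx - exact) / np.linalg.norm(exact)
print(f"relative error using half of the rows: {rel_error:.3f}")
```

The error shrinks as `n_samples` grows, so the sample size is the knob that trades accuracy for run time.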

## Statistical Shrinkage (4) – Covariance estimation

A common issue encountered in modern statistics involves the inversion of a matrix. For example, when your data is sick with multicollinearity your estimates for the regression coefficients can bounce all over the place.

In finance we use the covariance matrix as an input for portfolio construction. Analogous to the fact that variance must be positive, the covariance matrix must be positive definite to be meaningful. The focus of this post is on understanding the underlying issues with an unstable covariance matrix, identifying a practical solution for such an instability, and connecting that solution to the all-important concept of statistical shrinkage. I present a strong link between the following three concepts: regularization of the covariance matrix, ridge regression, and measurement error bias, with some easy-to-follow math.
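
A minimal sketch of the kind of regularization in question, under assumptions of my own choosing: simulated returns, a scaled-identity target, and a fixed, hand-picked shrinkage intensity of 0.3 rather than an estimated one. Pulling the sample covariance towards a multiple of the identity is the same device ridge regression applies to $X'X$, and it makes the matrix better conditioned before inversion.

```python
import numpy as np

rng = np.random.default_rng(1)
n_obs, n_assets = 40, 30  # few observations relative to assets
returns = rng.standard_normal((n_obs, n_assets))
sample_cov = np.cov(returns, rowvar=False)

def shrink_to_identity(cov, intensity):
    """Linear shrinkage of cov towards (average variance) * identity."""
    n = cov.shape[0]
    target = np.eye(n) * np.trace(cov) / n
    return (1.0 - intensity) * cov + intensity * target

# The intensity is a hypothetical, hand-picked value for illustration.
shrunk_cov = shrink_to_identity(sample_cov, intensity=0.3)

cond_before = np.linalg.cond(sample_cov)
cond_after = np.linalg.cond(shrunk_cov)
print(f"condition number before: {cond_before:.1f}, after: {cond_after:.1f}")
```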

## Beware of Spurious Factors

The word spurious refers to “outwardly similar or corresponding to something without having its genuine qualities.” Fake.

While the meanings of spurious correlation and spurious regression are common knowledge nowadays, much less is understood about spurious factors. This post draws your attention to recent, top-shelf research flagging the risks around spurious factor analysis. While formal solutions are still pending, there are a couple of heuristics we can use to detect possible problems.

## Beta in the tails

Every form of strength is also a form of weakness^{*}. I love statistics, but I focus too much on methodology, which is not for everyone. Some people (right or wrong) question: “wonderful sir, but what can I do with it?”.

A new paper titled *“Beta in the tails”* is a showcase application for why we should focus on correlation structure rather than on average correlation. They discuss the question: *Do hedge funds hedge?* The reply: No, they don’t!

The paper *“Beta in the tails”* was published in the *Journal of Econometrics* but you can find a link to a working paper version below. We start with a figure replicated from the paper, go through the meaning and interpretation of it, and explain the methods used thereafter.

## Correlation and correlation structure (4) – asymmetric correlations of equity portfolios

Here I share a refreshing idea from the paper “Asymmetric correlations of equity portfolios” which was published in the *Journal of Financial Economics*, a top-tier journal in this field. The question is how much the observed conditional correlation on the downside (say) differs from the conditional correlation you would expect from a symmetrical distribution. You can find here an explanation for the H-statistic developed in the aforementioned paper and some code for illustration.

## Robust Moving Average

The moving average is one of the most commonly used smoothing methods, basically the go-to. It helps us detect trends in the data by smoothing out short-term fluctuations. The computation is trivial: take the most recent k points and simple-average them. Here is how it looks:
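
A minimal sketch of the plain (non-robust) computation described above, with a made-up price series; the function name `moving_average` is an assumption for illustration. The last observation is a deliberate outlier, to show how strongly a single bad point moves the simple average.

```python
def moving_average(values, k):
    """Simple moving average over the most recent k points."""
    if k <= 0 or k > len(values):
        raise ValueError("k must be between 1 and len(values)")
    window_sum = sum(values[:k])
    averages = [window_sum / k]
    for i in range(k, len(values)):
        # Slide the window: add the newest point, drop the oldest.
        window_sum += values[i] - values[i - k]
        averages.append(window_sum / k)
    return averages

prices = [1.0, 2.0, 3.0, 4.0, 5.0, 100.0]  # hypothetical data, last point is an outlier
smoothed = moving_average(prices, k=3)
print(smoothed)
```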

## Portfolio Construction Tilting towards Higher Moments

When you build your portfolio you must decide what your risk profile is. A pension fund’s risk profile is different than that of a hedge fund, which is different than that of a family office. Everyone’s goal is to maximize returns given the risk. Sinfully but commonly, risk is defined as the variability in the portfolio, and so we feed our expected returns and expected risk to some optimization procedure in order to find the optimal portfolio weights. Risk serves as a decision variable. You choose the risk, and (hope to) get the returns.

A new paper from Kris Boudt, Dries Cornilly, Frederiek Van Holle and Joeri Willems titled *Algorithmic Portfolio Tilting to Harvest Higher Moment Gains* makes good progress in terms of our definition of risk, and risk-return trade-off. They propose a quantified way in which you can adjust your portfolio to account not only for the variance, but also for higher moments, namely skewness and kurtosis. They do that in two steps. The first is to simply set your portfolio based on whichever approach you follow (e.g. minvol, equal risk contribution or other). In the second step you tilt the portfolio such that the higher moments are brought into focus and get the attention they deserve. This is done by deviating from the original optimization target so that higher moments are utility-improved: less variance, better skew and lower kurtosis.

## Adaptive Huber Regression

Many years ago, when I was still trying to beat the market, I used to pair-trade. In principle it is quite straightforward to estimate the correlation between two stocks. The estimator for beta is very important since it determines how much you should long the one and how much you should short the other, in order to remain market-neutral. In practice it is indeed very easy to estimate, but I remember I never felt genuinely comfortable with the results. Not only because of instability over time, but also because the Ordinary Least Squares (OLS from here on) estimator is theoretically justified based on a few textbook assumptions, most of which are improper in practice. In addition, the OLS estimator is very sensitive to outliers. There are other good alternatives. I have described a couple of alternatives here and here. Here below is another alternative, provoked by a recent paper titled *Adaptive Huber Regression*.
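
To illustrate the sensitivity to outliers, here is a minimal sketch of ordinary Huber regression for a single regressor, fitted by iteratively reweighted least squares. It is not the adaptive procedure from the paper: the threshold `tau` is a fixed, hand-picked value, and the data and the function name `huber_fit` are made up for the example.

```python
def huber_fit(x, y, tau, n_iter=50):
    """Fit y = a + b * x by iteratively reweighted least squares
    with Huber weights. The first pass (unit weights) is plain OLS."""
    w = [1.0] * len(x)
    a = b = 0.0
    for _ in range(n_iter):
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        b = sxy / sxx
        a = my - b * mx
        # Huber weights: 1 for small residuals, tau / |r| for large ones.
        resid = [yi - a - b * xi for xi, yi in zip(x, y)]
        w = [1.0 if abs(r) <= tau else tau / abs(r) for r in resid]
    return a, b

# Hypothetical data: an exact line with beta = 2 and one gross outlier.
x = [float(i) for i in range(20)]
y = [1.0 + 2.0 * xi for xi in x]
y[-1] += 50.0

ols_intercept, ols_slope = huber_fit(x, y, tau=1.0, n_iter=1)
huber_intercept, huber_slope = huber_fit(x, y, tau=1.0)
print(f"OLS beta: {ols_slope:.3f}, Huber beta: {huber_slope:.3f}")
```

A single contaminated observation is enough to pull the OLS beta well away from 2, while the Huber estimate stays close to it.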

## Day of the week and the cross-section of returns

I just finished reading an interesting paper by Justin Birru titled: “Day of the week and the cross-section of returns” (reference below). The story is much too simple to be true, but it looks to be so. In fact, I would probably altogether skip it without the highly ranked *Journal of Financial Economics* stamp of approval. However, by the end of the paper I was as convinced as one can be without actually running the analysis.

## Create own Recession Indicator using Mixture Models

## Context

Broadly speaking, we can classify financial market conditions into two categories: Bull and Bear. The first is a “todo bien” market, tranquil and generally upward sloping. The second describes a market with a downward trend, usually more volatile. It is thought that those bull/bear terms originate from the way those animals supposedly attack. A bull thrusts its horns up while a bear swipes its paws down. At any given moment, we can only guess the state we are in, there is no way of telling really; simply because those two states don’t have uniformly exact definitions. So basically we never actually observe the membership of an observation. In this post we are going to use (finite) mixture models to try and assign daily equity returns to their bull/bear subgroups. It is essentially an unsupervised clustering exercise. We will create our own recession indicator to help us quantify if the equity market is contracting or not. We use minimal inputs, nothing but equity return data. We start with a short description of Finite Mixture Models and move on to a hands-on practical example.
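
A minimal sketch of the idea on simulated data, assuming a two-component Gaussian mixture fitted with a hand-rolled EM loop. The simulated "bull" and "bear" parameters, the starting values and the function name `fit_two_state_mixture` are all assumptions for illustration, not the post's actual setup.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def fit_two_state_mixture(returns, n_iter=200):
    """EM for a two-component Gaussian mixture.
    Component 0 starts as the calm state, component 1 as the volatile one."""
    n = len(returns)
    mean = sum(returns) / n
    sd = math.sqrt(sum((r - mean) ** 2 for r in returns) / n)
    mu, sigma, weight = [mean, mean], [0.5 * sd, 2.0 * sd], [0.5, 0.5]
    prob_bear = [0.5] * n
    for _ in range(n_iter):
        # E-step: probability that each observation belongs to component 1.
        prob_bear = []
        for r in returns:
            d0 = weight[0] * normal_pdf(r, mu[0], sigma[0])
            d1 = weight[1] * normal_pdf(r, mu[1], sigma[1])
            prob_bear.append(d1 / (d0 + d1))
        # M-step: weighted means, standard deviations and mixing weights.
        n1 = sum(prob_bear)
        n0 = n - n1
        mu[0] = sum((1 - p) * r for p, r in zip(prob_bear, returns)) / n0
        mu[1] = sum(p * r for p, r in zip(prob_bear, returns)) / n1
        sigma[0] = math.sqrt(sum((1 - p) * (r - mu[0]) ** 2 for p, r in zip(prob_bear, returns)) / n0)
        sigma[1] = math.sqrt(sum(p * (r - mu[1]) ** 2 for p, r in zip(prob_bear, returns)) / n1)
        weight = [n0 / n, n1 / n]
    return mu, sigma, weight, prob_bear

# Hypothetical daily returns in percent: 800 calm days, then 200 volatile days.
random.seed(42)
returns = [random.gauss(0.05, 0.7) for _ in range(800)]
returns += [random.gauss(-0.2, 2.5) for _ in range(200)]

mu, sigma, weight, prob_bear = fit_two_state_mixture(returns)
print(f"estimated volatilities: {sigma[0]:.2f} (bull), {sigma[1]:.2f} (bear)")
print(f"estimated share of bear days: {weight[1]:.2f}")
```

The vector `prob_bear` plays the role of the indicator: for each day it gives the estimated probability of belonging to the volatile state, which we never observe directly.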

## Price Movement Prediction – another paper

Just finished reading the paper *Stock Market’s Price Movement Prediction With LSTM Neural Networks*. The abstract attractively reads: “The results that were obtained are promising, getting up to an average of 55.9% of accuracy when predicting if the price of a particular stock is going to go up or not in the near future.” I took the bait. You shouldn’t.

## Market intraday momentum

I recently spotted the following intriguing paper: Market intraday momentum.

From the abstract of that paper:

Based on high frequency S&P 500 exchange-traded fund (ETF) data from 1993–2013, we show an intraday momentum pattern: the first half-hour return on the market as measured from the previous day’s market close predicts the last half-hour return. This predictability, which is both statistically and economically significant is stronger on more volatile days, on higher volume days, on recession days, and on major macroeconomic news release days.

Nice! Looks like we can all become rich now. I mean, given how it’s written, it should be quite easy for any individual with a trading account and a mouse to leverage up and start accumulating. Maybe this is so, but let’s have an informal closer look, with as little effort as possible, and see if there is anything we can say about this idea.

## R in Finance highlights

The yearly *R in Finance* conference is one of my favorites:

## Curse of dimensionality part 3: Higher-Order Comoments

Higher moments such as Skewness and Kurtosis are not as explored as they should be.

These moments are crucial for managing portfolio risk. At least as important as volatility, if not more. Skewness relates to asymmetry risk and Kurtosis relates to tail risk.

Despite their great importance, those higher moments enjoy only a small portion of attention compared with their lower more friendly moments: the mean and the variance. In my opinion, one reason for this may be the impossibility of estimating those moments, estimating them accurately that is.

It is yet another situation where the Curse of Dimensionality rears its enchanting head (and an idea for a post is born...).
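
To get a feel for the scale of the problem, the short sketch below counts the distinct elements of the covariance, coskewness and cokurtosis arrays for a given number of assets. The function name `n_unique_comoments` is an assumption for illustration; the count itself is the standard number of distinct entries in a symmetric array of a given order.

```python
from math import comb

def n_unique_comoments(n_assets, order):
    """Number of distinct elements in a symmetric comoment array:
    order 2 is covariance, 3 is coskewness, 4 is cokurtosis."""
    return comb(n_assets + order - 1, order)

for order, name in [(2, "covariance"), (3, "coskewness"), (4, "cokurtosis")]:
    print(f"{name}: {n_unique_comoments(100, order)} distinct elements for 100 assets")
```

With 100 assets there are already 171,700 distinct coskewness elements and over 4.4 million cokurtosis elements, against only 5,050 for the covariance.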