R - 4/4 - Eran Raviv

A shrinkage estimator for beta

Blog, Finance and Trading, R, Risk - Posted on 08/28/2012

In the post Pairs Trading Issues, one of the problems raised was the instability of the estimated beta of a stock with respect to the market. Here is a suggestion for a possible solution, which is not really a solution but rather something to do to make you feel less stupid when trading on your fragile estimates.

Forecasting the Eurozone Misery index

Blog, Finance and Trading, R, Statistics and Econometrics - Posted on 05/23/2012

Is Miss Stagflation coming to visit?
The Misery index is the sum of the inflation rate and the unemployment rate. We would like both to stay naturally low, and we are miserable when they are not. The index is currently floating around its record levels. In this post I demonstrate the use of the nice FitAR package in R to fit an AR model and see what we can expect accordingly. Inflation and unemployment numbers for the Eurozone (17 countries) can be found here.
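To make the idea concrete, here is a minimal Python sketch of building the index and fitting the simplest autoregressive model, an AR(1), by least squares. It is not the FitAR package used in the post, which is an R package and handles richer AR specifications. The function names `misery_index`, `fit_ar1` and `forecast_ar1` are hypothetical, and the numbers are made up for illustration, not Eurostat data.

```python
def misery_index(inflation, unemployment):
    """Misery index: inflation rate plus unemployment rate, period by period."""
    return [i + u for i, u in zip(inflation, unemployment)]


def fit_ar1(series):
    """Fit x_t = c + phi * x_{t-1} + e_t by ordinary least squares.

    Returns (c, phi). This is only an AR(1); higher orders need more lags.
    """
    lagged, current = series[:-1], series[1:]
    n = len(lagged)
    lag_mean = sum(lagged) / n
    cur_mean = sum(current) / n
    sxy = sum((a - lag_mean) * (b - cur_mean) for a, b in zip(lagged, current))
    sxx = sum((a - lag_mean) ** 2 for a in lagged)
    phi = sxy / sxx
    c = cur_mean - phi * lag_mean
    return c, phi


def forecast_ar1(series, c, phi, steps=1):
    """Iterate the fitted recursion forward from the last observation."""
    path, last = [], series[-1]
    for _ in range(steps):
        last = c + phi * last
        path.append(last)
    return path


# Made-up illustrative rates (in percent), NOT actual Eurozone figures.
inflation = [2.1, 2.4, 2.7, 2.6, 2.9, 3.0, 2.7, 2.6]
unemployment = [9.9, 10.0, 10.2, 10.5, 10.7, 10.9, 11.0, 11.2]

index = misery_index(inflation, unemployment)
c, phi = fit_ar1(index)
print("fitted intercept and AR coefficient:", c, phi)
print("next three periods:", forecast_ar1(index, c, phi, steps=3))
```

With so few observations the fit means little; the point is only the mechanics of regressing the series on its own lag and rolling the recursion forward.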
Have a look at the index over time:

Kurtosis Interpretation

Blog, R, Statistics and Econometrics - Posted on 05/07/2012

When you google “Kurtosis”, you encounter many formulas to help you calculate it, talk about how this measure is used to evaluate the “peakedness” of your data, maybe some other measures to help you do so, maybe all of a sudden a side step towards Skewness, and how both Skewness and Kurtosis are higher moments of the distribution. This is all very true, but maybe you just want to understand what Kurtosis means and how to interpret it, much as you interpret the standard deviation (roughly, the typical distance from the average). Here I take a shot at giving a more intuitive interpretation.
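As a reference point for the formulas mentioned above, here is a small Python sketch of the usual moment-based definition: kurtosis is the average fourth power of the standardized observations, so values far from the mean dominate it. The function name `kurtosis` is hypothetical and the population (divide by n) form is assumed; software packages often apply small-sample corrections or report excess kurtosis (the value minus 3) instead.

```python
def kurtosis(xs):
    """Population kurtosis: fourth central moment over squared variance."""
    n = len(xs)
    mean = sum(xs) / n
    variance = sum((x - mean) ** 2 for x in xs) / n
    fourth_moment = sum((x - mean) ** 4 for x in xs) / n
    return fourth_moment / variance ** 2


# Two toy samples with the same mean (0) and the same variance (1).
# All mass sits one standard deviation from the mean: lowest possible kurtosis.
two_point = [-1.0, 1.0, -1.0, 1.0]
# Mostly zeros with two rare, larger values: same variance, higher kurtosis.
rare_extremes = [-2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0]

print(kurtosis(two_point))      # 1.0
print(kurtosis(rare_extremes))  # 4.0
```

Both samples have the same standard deviation, yet the second gets its variance from a few rare, large values, and that is what the higher number picks up.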

Marriage is good for your income

Blog, Miscellaneous, R, Statistics and Econometrics - Posted on 04/29/2012

For those of you who are into machine learning, here you can find a cool collection of databases on which to try your favorite algorithm. I chose one of the 200 available and fit a logistic regression model. The idea is to see which properties are common to those who earn above 50K a year. Our data is such that the “y” variable is binary: a value of 1 is given if the individual earns above 50K and 0 if below. We know many things about the individual: level of education in years, age, whether she is married, where she is from, which sector she works in, how many hours she works per week, race, and more. We can fit a logistic regression, which is quite standard for a binary dependent variable, and see which variables are important.
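For readers who want to see the moving parts, below is a minimal Python sketch of logistic regression fitted by plain gradient descent. It is an illustration under stated assumptions, not the code behind the post (which uses R): the names `fit_logistic` and `predict_proba` are hypothetical, and the ten observations are invented toy data with two predictors, centered education and a married indicator.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(w, x):
    """P(y = 1 | x) for weights w, where w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=0.5, n_iter=5000):
    """Minimize the average log-loss by batch gradient descent."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi  # derivative of log-loss wrt the linear score
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


# Invented toy data: [education (centered), married (0/1)]; y = 1 if income > 50K.
X = [[-1.0, 0], [-0.5, 0], [0.0, 0], [0.5, 0], [1.0, 0],
     [-1.0, 1], [-0.5, 1], [0.0, 1], [0.5, 1], [1.0, 1]]
y = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]

w = fit_logistic(X, y)
print("intercept, education, married:", w)
print("P(>50K | high education, married):", predict_proba(w, [1.0, 1]))
```

A positive coefficient means the variable raises the odds of earning above 50K, holding the others fixed. In practice a statistics package would also report standard errors, which this sketch leaves out.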

Backtesting trading strategies with R

Blog, Finance and Trading, R - Posted on 04/21/2012

A few weeks back I gave a talk about backtesting trading strategies with R. I got a few requests for the slides, so here they are:

Most profitable hedge fund style

Blog, Finance and Trading, R - Posted on 04/21/2012

This is not investment advice!!

A couple of weeks back, during the amst-R-dam user group talk on backtesting trading strategies using R, I mentioned that the most effective style for hedge funds is relative value statistical arbitrage. I had read it somewhere. After the talk was over, I was no longer sure it was correct to say so, and decided to check.

Bootstrap example

Blog, R, Statistics and Econometrics - Posted on 03/30/2012

Bootstrap your way into robust inference. Wow, that was fun to write..

Introduction
Say you ran a simple regression and now you have your $\widehat{\beta}$. You wish to know if it is significantly different from (say) zero. In general, people look at the test statistic or p-value reported by their software of choice (heRe). Thing is, this p-value calculation relies on the distribution of your dependent variable. Your software assumes a normal distribution unless told otherwise. How so? For example, the 95% confidence interval is $\widehat{\beta} \pm 1.96 \times sd(\widehat{\beta})$, where the 1.96 comes from the normal distribution.
It is advisable not to rely on that. The beauty of bootstrapping* is that it is untroubled by the distribution: it is valid for a dependent variable that is Gaussian, Cauchy, or whatever. You can defend yourself against misspecification, and/or use the tool for inference when the underlying distribution is unknown.
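As a rough illustration of the idea, here is a Python sketch of a pairs bootstrap for the slope of a simple regression: resample the (x, y) pairs with replacement, re-estimate the slope each time, and read the confidence interval off the percentiles of the resampled slopes instead of using the 1.96 rule. The names `ols_slope` and `bootstrap_slope_ci` are hypothetical, the data is simulated, and the percentile interval is only one of several bootstrap intervals; the post itself may use a different variant.

```python
import random


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    return sxy / sxx


def bootstrap_slope_ci(x, y, n_boot=2000, level=0.95, seed=0):
    """Percentile confidence interval for the slope from a pairs bootstrap."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample rows with replacement
        xs = [x[i] for i in idx]
        if max(xs) == min(xs):
            continue  # degenerate resample: slope undefined, draw again
        ys = [y[i] for i in idx]
        slopes.append(ols_slope(xs, ys))
    slopes.sort()
    alpha = 1.0 - level
    lower = slopes[int(alpha / 2 * n_boot)]
    upper = slopes[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Simulated data: true slope 1.5, non-Gaussian (double-exponential) noise.
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(50)]
y = [1.5 * xi + rng.expovariate(1) - rng.expovariate(1) for xi in x]

print("estimated slope:", ols_slope(x, y))
print("95% bootstrap interval:", bootstrap_slope_ci(x, y))
```

If the interval excludes zero, the slope is significantly different from zero at the chosen level, and no normality assumption was needed to reach that conclusion.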

Europe most dangerous cities

Blog, R - Posted on 03/15/2012

When I was searching for data about the U.S. prison population for another post, I ran across eurostat, a nice source of data to play around with. I pulled some numbers, specifically homicides recorded by the police: panel data for 36 cities, from 2000 to 2009. Let's see which cities have problems in this area.

Piecewise Regression

Blog, R, Statistics and Econometrics - Posted on 02/11/2012

The beta of a stock generally means its relation with the market: the percentage move we should expect from the stock when the market moves by one percent.

The market, being a somewhat vague notion, is approximated here, as usual, by the S&P 500. The aforementioned relation (henceforth, beta) is central to many aspects of trading and risk management. It is already well established that volatility has different dynamics in rising markets and in declining markets. Recently, I read a few papers that suggest the same holds true for beta, specifically that the beta is not the same for rising markets and for declining markets. We use regression to estimate beta anyway, so piecewise linear regression fits right in for an investor/speculator who wishes to accommodate this asymmetry.
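One simple way to set this up, sketched below in Python, is to split the market return into its positive and negative parts and regress the stock on both, which gives one beta for up markets and another for down markets with a kink at zero. This is a sketch under assumptions: placing the breakpoint at zero is my choice for illustration, the names `solve` and `piecewise_beta` are hypothetical, and the returns are made up.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def piecewise_beta(market, stock):
    """Least-squares fit of stock = a + b_up*max(m, 0) + b_down*min(m, 0).

    Returns (a, b_up, b_down); the breakpoint at zero is an assumption.
    """
    rows = [[1.0, max(m, 0.0), min(m, 0.0)] for m in market]
    k = 3
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * s for r, s in zip(rows, stock)) for i in range(k)]
    a, b_up, b_down = solve(xtx, xty)
    return a, b_up, b_down


# Made-up daily returns in percent, for illustration only.
market = [-2.0, -1.2, -0.4, 0.3, 0.9, 1.5, -0.8, 0.6]
stock = [-3.1, -1.9, -0.5, 0.2, 0.8, 1.4, -1.3, 0.7]

a, b_up, b_down = piecewise_beta(market, stock)
print("intercept:", a, "up-market beta:", b_up, "down-market beta:", b_down)
```

If the two betas come out close to each other, the ordinary single-beta regression loses little; a clear gap between them is the asymmetry the papers describe.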

Resistant Regression

Blog, R, Statistics and Econometrics - Posted on 01/29/2012

It is a fact that on most days, not much is going on in the stock market. When we estimate the relation of a stock with the market, or the “beta” of a stock, we use all available daily returns. This might not be wise, as some days are not really typical and contaminate our estimate. For example, Steve Jobs passed away recently, and AAPL moved quite a bit as a result. However, this is a distinct event that does not reflect the relation with the market but is company specific. Our aim is to exclude such observations, keeping in mind that we don’t want to lose too much information, since not all large swings are irrelevant.
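To show what "resistant" buys you, here is a short Python sketch using the Theil-Sen estimator: the slope is the median of the slopes between all pairs of points, so a handful of wild observations cannot drag it around. This is a swapped-in technique chosen because it fits in a few lines; it is not necessarily the method used in the post, and the function name `theil_sen` and the numbers are hypothetical.

```python
from itertools import combinations
from statistics import median


def theil_sen(x, y):
    """Resistant line fit: median of pairwise slopes, then median intercept."""
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]  # skip pairs with identical x, where the slope is undefined
    ]
    slope = median(slopes)
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return intercept, slope


# Made-up returns: the stock moves twice as much as the market,
# except on one day with a large company-specific jump.
market = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
stock = [2.0, 4.0, 6.0, 100.0, 10.0, 12.0, 14.0]

intercept, beta = theil_sen(market, stock)
print("resistant intercept and beta:", intercept, beta)
```

The single atypical day affects only the pairs it takes part in, so the median slope still reflects the other six observations. An ordinary least-squares fit on the same data would be pulled toward the outlier.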

Pairs Trading Issues

Blog, Finance and Trading, R, Statistics and Econometrics - Posted on 12/18/2011

A few words for those of you who are not familiar with the “pairs trading” concept. First you should understand that the movement of every stock is dominated not by the company's performance but by the general market movement. This is the origin of many “factor models”: the factor that drives every stock is the market factor, which in most cases is approximated by the S&P index.

What is important for a loan?

Blog, Miscellaneous, R, Statistics and Econometrics - Posted on 11/06/2011

A few months back I read this post, which referred to this amazing data set. The numbers are for individuals who borrowed money: the amounts, terms and conditions of the loans, and much more. Most of the people naturally paid back the loan in full, however, some did not,

OLS beta VS. Robust beta

Blog, Finance and Trading, R, Statistics and Econometrics - Posted on 10/03/2011

In a financial context, $\beta$ is supposed to reflect the relation between a stock and the general market. A broad-based index such as the S&P 500 is often taken as a proxy for the general market. The $\beta$, without getting into too much detail, is estimated using the regression: $$stock_i = \beta_0+\beta_1market_i+e_i$$ A $\widehat{\beta_1}$ of, say, 1.5 means that when the market goes up 1%, the specific stock goes up 1.5%. (Ignoring all the biases at the moment!)
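For a single regressor, the least-squares estimates in the regression above have a closed form: $\widehat{\beta_1}$ is the covariance of the stock and market returns divided by the variance of the market returns, and $\widehat{\beta_0}$ makes the line pass through the two means. A minimal Python sketch follows, with a hypothetical function name `ols_beta` and made-up returns constructed so that the stock moves exactly 1.5 times the market.

```python
def ols_beta(market, stock):
    """OLS estimates (beta_0, beta_1) for stock_i = beta_0 + beta_1*market_i + e_i."""
    n = len(market)
    market_mean = sum(market) / n
    stock_mean = sum(stock) / n
    covariance = sum((m - market_mean) * (s - stock_mean)
                     for m, s in zip(market, stock))
    variance = sum((m - market_mean) ** 2 for m in market)
    beta_1 = covariance / variance
    beta_0 = stock_mean - beta_1 * market_mean
    return beta_0, beta_1


# Made-up returns in percent: stock = 0.5 + 1.5 * market, with no noise.
market = [-1.0, 0.0, 1.0, 2.0]
stock = [-1.0, 0.5, 2.0, 3.5]

beta_0, beta_1 = ols_beta(market, stock)
print("beta_0:", beta_0, "beta_1:", beta_1)
```

Because every observation enters through squared deviations, a single extreme day can move this estimate a lot, which is the motivation for comparing it with a robust alternative.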