Eran Raviv

Advances in post-model-selection inference

Blog, Statistics and Econometrics. Posted on 09/23/2014

Along with improvements in computational power, variable selection has become one of the problems attracting the most effort. We (well, the experts) have made huge leaps in the realm of variable selection, with prediction probably the most common objective. The LASSO (Least Absolute Shrinkage and Selection Operator) leads the way from the west (Stanford) with its many variations (Adaptive, Random, Relaxed, Fused, Grouped, Bayesian, you name it), while SCAD (Smoothly Clipped Absolute Deviation) is catching up from the east (Princeton). With good progress made in that area, inference, which is no less important but has received less attention, is now being worked out.

PCA as regression

Blog, Statistics and Econometrics. Posted on 09/17/2014

A way to think about principal component analysis is as a matrix approximation. We have a matrix $X_{T \times P}$ and we want to get a ‘smaller’ matrix $Z_{T \times K}$ with $K<P$. We want the new ‘smaller’ matrix to be close to the original despite its reduced dimension. Sometimes we say ‘such that Z captures the bulk of comovement in X’. Big data technology is such that nowadays the number of cross-sectional units (the number of columns in X), P, has grown very large compared with, say, the sixties. Now, with ‘google maps would like to use your current location’ and a future ‘google fridge would like to access your amazon shopping list’, you can count on P growing exponentially; we are just getting started. A lot of effort goes into this line of research, and great leaps are being made.
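To make the matrix-approximation view concrete, here is a minimal sketch in base R. The data are simulated (two made-up common factors plus noise), and the names `T_obs`, `F_true` and `Lambda` are illustrative only. It builds the ‘smaller’ matrix Z from the first K principal components, measures how much of X it captures, and checks that regressing X on Z recovers the loadings.

```r
set.seed(1)
T_obs <- 200; P <- 10; K <- 2

# Simulated data (an assumption for illustration): K common factors
# drive all P columns, plus idiosyncratic noise
F_true <- matrix(rnorm(T_obs * K), T_obs, K)
Lambda <- matrix(rnorm(K * P), K, P)
X <- F_true %*% Lambda + 0.3 * matrix(rnorm(T_obs * P), T_obs, P)
X <- scale(X, center = TRUE, scale = FALSE)   # demean each column

pc <- prcomp(X, center = FALSE)
Z <- pc$x[, 1:K]                       # the 'smaller' T x K matrix
X_hat <- Z %*% t(pc$rotation[, 1:K])   # rank-K approximation of X

# Share of the total variation in X captured by Z
share <- 1 - sum((X - X_hat)^2) / sum(X^2)

# Regression view: OLS of X on Z gives back the (transposed) loadings
B <- solve(crossprod(Z), crossprod(Z, X))
max_diff <- max(abs(B - t(pc$rotation[, 1:K])))

print(share)
print(max_diff)
```

Because the columns of Z are orthogonal, the least-squares coefficients coincide with the PCA loadings, which is one way to read PCA as a regression.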

On the nonfarm payroll number

Blog, Finance and Trading. Posted on 09/14/2014

The total nonfarm payroll accounts for approximately 80% of the workers who produce the GDP of the United States. Despite the widely acknowledged fact that the nonfarm payroll number is highly volatile and heavily revised, it still drives both bond and equity markets before and after it is published. The most recent number came in at a weak 142K, compared with an average of around 200K over the past 12 months. What we would like to know now, but will only know later, is whether this number marks the start of a weaker expansion in the workforce.
Although it is definitely on the weak side (see the top panel of the figure), it is nothing unusual (see the bottom panel of the figure).

The bottom panel charts the forecast intervals, that is, the range you would have had before the number was published, from a simple AR(1) model without imposing normality. The blue and red lines mark 1 and 2 standard deviations respectively. The recent number barely scratches the lower blue line, so nothing suggests a significant shift from a healthy 200K. On the other hand, there is some persistence:


ar.ols(na.omit(nfp))

Call:
ar.ols(x = na.omit(nfp))

Coefficients:
      1        2        3        4        5        6  
 0.2633   0.2672   0.1402   0.0841   0.1015  -0.0853  

Intercept: 0.318 (5.906) 

Order selected 6  sigma^2 estimated as  31430

So, on average, we can expect the number to trend lower.
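One rough way to put a number on that persistence is to add up the estimated AR coefficients. The sketch below copies the six coefficients from the output above and also checks that the fitted process is stationary. Reading the sum of coefficients as "the" persistence measure is my assumption, one common summary among several.

```r
# AR coefficients copied from the ar.ols output above
phi <- c(0.2633, 0.2672, 0.1402, 0.0841, 0.1015, -0.0853)

# Sum of the AR coefficients: closer to 1 means shocks die out more slowly
persistence <- sum(phi)

# Stationarity check: all roots of 1 - phi_1 z - ... - phi_6 z^6
# must lie outside the unit circle
roots <- polyroot(c(1, -phi))
stationary <- all(Mod(roots) > 1)

# Cumulative long-run effect of a one-off unit shock
long_run <- 1 / (1 - persistence)

print(c(persistence = persistence, long_run = long_run))
print(stationary)
```

The coefficients sum to about 0.77, so a weak print tends to be followed by further below-average prints before the series drifts back to its mean.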

Code for figure:


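library(quantmod)  # getSymbols() and the xts helpers
library(Eplot)     # plott() and FCIplot()

# Download total nonfarm payrolls (PAYEMS) from FRED into its own environment
tempenv <- new.env()
getSymbols("PAYEMS", src = "FRED", env = tempenv)
head(tempenv$PAYEMS)
time <- index(tempenv$PAYEMS)
# Monthly change in payrolls (thousands); the first element is NA
nfp <- as.numeric(diff(tempenv$PAYEMS))
par(mfrow = c(2, 1))
k <- 24  # number of most recent months to plot
args(plott)    # show the arguments plott() accepts
plott(tail(nfp, k), tail(time, k), return.to.default = FALSE, main = "NFP-changes")
args(FCIplot)  # show the arguments FCIplot() accepts
nfpsd <- FCIplot(nfp, k = k, rrr1 = "Rol", rrr2 = "Rol", main = "NFP-changes; forecast intervals superimposed")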
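For readers without the Eplot package, here is a base-R sketch of the idea behind the bottom panel: fit an AR(1), forecast one step ahead, and draw bands at 1 and 2 standard deviations of the forecast errors. It uses a simulated series as a stand-in for the payroll changes and is not a reproduction of what `FCIplot` does internally. The empirical-quantile interval at the end is my own addition, as one way to avoid a normality assumption.

```r
set.seed(42)
n <- 300

# Simulated stand-in for monthly payroll changes (NOT the real data)
x <- as.numeric(arima.sim(list(ar = 0.5), n = n, sd = 100)) + 200

# AR(1) by OLS: x_t = a + b * x_{t-1} + e_t
y    <- x[-1]
ylag <- x[-n]
fit  <- lm(y ~ ylag)
e    <- resid(fit)

# One-step-ahead point forecast for the next, unseen observation
fc <- unname(coef(fit)[1] + coef(fit)[2] * x[n])

# Bands at 1 and 2 residual standard deviations around the forecast
s <- sd(e)
bands <- c(lo2 = fc - 2 * s, lo1 = fc - s, point = fc,
           hi1 = fc + s, hi2 = fc + 2 * s)

# Distribution-free alternative: empirical quantiles of the residuals
q <- fc + quantile(e, c(0.025, 0.975))

print(round(bands, 1))
print(round(q, 1))
```

A new number that lands inside the 1 standard deviation band, as the recent 142K roughly does in the figure, is hard to call a regime change.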

Eplot (1)

Blog, Code, R. Posted on 09/08/2014

The Eplot package is on CRAN.

R vs MATLAB (Round 3)

Blog, Code, Miscellaneous, MiscTips, R. Posted on 08/28/2014

At least for me, R by faR. MATLAB has its own way of doing things, which, to be honest, can probably be defended from many angles. Here are a few examples of not-so-subtle differences between R and MATLAB:

Mom, are we bear yet?

Blog, Finance and Trading. Posted on 08/09/2014

One way to help us decide is to estimate a regime-switching model for the VIX and see whether volatility has crossed over to the bear regime.

Non-linear beta

Blog, Finance and Trading, Risk. Posted on 07/28/2014

If you google-finance AMZN you can see the beta is 0.93. I have already written about this elusive concept. Beta is supposed to reflect the risk of an instrument with respect to, for example, the market. However, you can estimate this measure in all kinds of ways.
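To illustrate the "all kinds of ways" point, here is a small sketch on simulated returns (the series, the true beta of 0.9 and the 120-observation window are all made up for illustration). It computes beta as an OLS slope, as a covariance ratio, on a recent window only, and separately for up and down markets. The up/down split is my own simple take on a non-linear beta, not necessarily the one used in the post.

```r
set.seed(7)
n <- 500

# Simulated daily returns (an assumption, not market data)
mkt   <- rnorm(n, 0, 0.01)
stock <- 0.9 * mkt + rnorm(n, 0, 0.02)

# 1) OLS slope of stock returns on market returns
beta_ols <- unname(coef(lm(stock ~ mkt))[2])

# 2) Covariance over variance: algebraically the same number
beta_cov <- cov(stock, mkt) / var(mkt)

# 3) Same estimator, but using only the last 120 observations
idx <- (n - 119):n
beta_recent <- cov(stock[idx], mkt[idx]) / var(mkt[idx])

# 4) Separate slopes for up and down markets
beta_up   <- unname(coef(lm(stock ~ mkt, subset = mkt > 0))[2])
beta_down <- unname(coef(lm(stock ~ mkt, subset = mkt <= 0))[2])

print(c(ols = beta_ols, cov = beta_cov, recent = beta_recent,
        up = beta_up, down = beta_down))
```

The first two agree exactly; the others differ only because of the sample they use, which is the point: a single quoted beta hides the choices behind it.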

Bias vs. Consistency

Blog, Statistics and Econometrics. Posted on 06/02/2014

The concepts of unbiasedness and consistency, and the relation between the two, are tough to get one’s head around, especially for undergraduate students, but not only for them. My aim here is to help with this. We start with a short explanation of the two concepts and follow with an illustration.
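As a quick taste of the distinction, here is a simulation sketch with an estimator that is biased yet consistent: the variance estimator that divides by n rather than n - 1. The choice of estimator, sample sizes and number of replications are my assumptions for illustration.

```r
set.seed(123)

# Variance estimator that divides by n instead of n - 1
var_n <- function(x) mean((x - mean(x))^2)

# Bias: average the estimator over many small samples.
# The true variance is 1, but the expectation is (n - 1) / n = 0.8 for n = 5
n_small    <- 5
est_small  <- replicate(20000, var_n(rnorm(n_small)))
mean_small <- mean(est_small)

# Consistency: with one large sample the estimate sits close to the truth
est_large <- var_n(rnorm(1e5))

print(c(average_small_n = mean_small, single_large_n = est_large))
```

Averaging over many small samples does not remove the bias, while a single large sample lands close to the true value of 1.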

Bootstrap Criticism (example)

Blog, Statistics and Econometrics. Posted on 05/14/2014

In a previous post I underlined an inherent feature of the non-parametric bootstrap: its heavy reliance on the (single) realization of the data. This feature is not a bad one per se; we just need to be aware of its limitations. From comments made on the other post, I gathered that a more concrete example could help get this point across.
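A minimal sketch of that reliance, on simulated data (sample size, number of resamples and the helper `boot_means` are illustrative assumptions): two samples from the same population give bootstrap distributions centred on their own sample means, and no resample can ever exceed the largest value that happened to be observed.

```r
set.seed(2014)

# Bootstrap distribution of the mean for a given sample
boot_means <- function(x, B = 2000) {
  replicate(B, mean(sample(x, replace = TRUE)))
}

# Two small samples from the same population (true mean is 0)
x1 <- rnorm(20)
x2 <- rnorm(20)
b1 <- boot_means(x1)
b2 <- boot_means(x2)

# Each bootstrap distribution centres on its own sample mean,
# not on the population mean
centres <- rbind(sample_mean = c(mean(x1), mean(x2)),
                 boot_centre = c(mean(b1), mean(b2)))

# The resampled maximum can never exceed the observed maximum
boot_max <- replicate(2000, max(sample(x1, replace = TRUE)))

print(round(centres, 3))
print(max(boot_max) <= max(x1))
```

Whatever is peculiar about the one sample in hand is inherited by every resample drawn from it.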

Detecting bubbles in real time

Blog, Finance and Trading, R, Statistics and Econometrics. Posted on 04/14/2014

Recently we have been hearing a lot about a housing bubble forming in the UK. It would be great to have a formal test for identifying a bubble as it evolves in real time, but I am not familiar with any such test. However, we can still do something to help us gauge whether what we are seeing is indeed a bubbly process, which is bound to end badly.
