Few weeks back I simulated a model and made the point that in practice, the difference between Bayesian and Frequentist is not large. Here I apply the code to some real data; a model for Industrial Production (IP).

## Stocks with upside potential

THIS IS NOT INVESTMENT ADVICE. ACTING BASED ON THIS POST MAY, AND IN ALL PROBABILITY WILL, CAUSE MONETARY LOSS.

Quantile regression is now established as an important econometric tool. Unlike mean regression (OLS), the target is not the mean given x but some quantile given x. You can use it to find stocks that present good upside potential. You may think it has to do with the beta of a stock, but the beta is OLS-related, and is symmetric. High-beta stock rewards with an upside swing if the market spikes but symmetrically, you can suffer a large draw-down when the market drops. This is not an upside potential.

## Bayesian vs. Frequentist in Practice

Rivers of ink have been spilled over the ‘Bayesian vs. Frequentist’ dispute. Most of us were trained as Frequentists. Probably because the computational power needed for Bayesian analysis was not around when the syllabus of your statistical/econometric courses was formed. In this age of tablets and fast internet connection, your training does not matter much, you can easily transform between the two approaches, engaging the right webpages/communities. I will not talk about the ideological differences between the two, or which approach is more appealing and why. Larry Wasserman already gave an excellent review.

## Understanding Multicollinearity

Roughly speaking, Multicollinearity occurs when two or more regressors are highly correlated. As with heteroskedasticity, students often know what does it mean, how to detect it and are taught how to cope with it, but not *why* is it so. From Wikipedia: “In this situation (Multicollinearity) the coefficient estimates may change erratically in response to small changes in the model or the data.” The Wikipedia entry continues to discuss detection, implications and remedies. Here I try to provide the intuition.

## How Important is Variable Selection?

Very.

If you have 10 possible independent regressors, and *none of which matter*, you have a good chance to find at least one is important.

## Design quotes

*Slides 18 and 30 are especially nice:*

## Quantify your jogging

Numbers are useful (I think we can all agree on that..). If you own a smart phone, you can install this runmeter app. When you run, you can take the smartphone with you and activate this app to collect interesting numbers like distance, pace, fastest pace, heart rate*, calories etc. Now we can load the statistics collected over the past months into R and have a quantified look at the progress.

## R and Dropbox

When you woRk, you probably have a set of useful functions/packages you constantly use. For example, I often use the excellent quantmod package, and the nice multi.sapply function. You want your tools loaded when R session fires.

## Moving Average Representation of VAR

A vector autoregression (VAR) process can be represented in a couple of ways. The usual form is as follows:

## Multivariate summary function in R

Some time ago, I wrote a Better summary function in R . Here is its multivariate extension:

## Quantile Autoregression in R

In the past, I wrote about robust regression. This is an important tool which handles outliers in the data. Roger Koenker is a substantial contributor in this area. His website is full of useful information and code so visit when you have time for it. The paper which drew my attention is “Quantile Autoregression” found under his research tab, it is a significant extension to the time series domain. Here you will find short demonstration for stuff you can do with quantile autoregression in R.

## On p-value

Albert Schweitzer said: “Example is not the main thing in influencing others. It is the only thing.”, so I start with it.

## Heteroskedasticity tests

Assume you have a variable *y*, which has an expectation and a variance. The expectation is often modeled using linear regression so that E(y) equals, on average, $\beta_0 +\beta_1x$. The origin of the variability in *y* is the residual. Now, standard econometric courses start with the simple notion of “constant variance”, which means that the variance of the disturbances is steady and is not related to any of the explanatory variables that were chosen to model the expectation, this is called homoskedasticity assumption. In fact, in real life it is rarely the case. Courses should start with the heteroskedasticity assumption as this is the prevalent state of the world. In almost any situation you will encounter, the variance of the dependent variable is not constant, it matters what is the *x* for which we want to determine the variance of *y*.

## Live Correlation plot, shiny improvement.

Open CPU is a great project. Few months back, I wrote a function for plotting a moving window of the market average correlation. Jeroen C.L. Ooms was nice enough to upload it to their server. Something is now changed. Quotes now return as a character class, as oppose to numeric. This messes up the function and the plot does not renders. I don’t wish to disturb Jeroen C.L. Ooms again with the correction for the code (despite his kind replies in the past). This problem creates the opportunity to look at the glistening “Shiny” package. I used it to (quickly..) build an app for the plot. You can now view a live correlation plot with the moving window of your choice. Live, as the app requests **current **market data. The width of the window for correlation calculation is given as an input parameter.

## A Simple Model for Realized Volatility

The post has two goals:

**(1)** Explain how to forecast volatility using a simple Heterogeneous Auto-Regressive (HAR) model. (Corsi, 2002)

**(2)** Check if higher moments like Skewness and Kurtosis add forecast value to this model.