Boundary corrected kernel density

Density estimation is now a trivial one-liner in all modern software. What is not so easy is becoming comfortable with the result: how well is my density estimated? We rarely know. One reason is the lack of ground truth. Density estimation falls under unsupervised learning, so we never observe the underlying truth. Another reason is that the theory around density estimation is seldom useful for the particular case at hand, which means that trial and error is a requisite.

Standard kernel density estimation is by far the most popular method for density estimation. However, it is biased around the edges of the support. In this post I show what this bias implies and present a simple way, though not the only way, to correct for it. In practice, this lets you present density curves that make sense, rather than apologizing (as I often did) for an estimate that makes less sense around the edges of the chart, which is what you get from a standard software implementation.
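To make the edge bias concrete, here is a minimal sketch using only the Python standard library. It assumes a Gaussian kernel, a hand-picked bandwidth, and the reflection method as the correction; the excerpt does not say which correction the post uses, so reflection is a stand-in chosen for its simplicity. The function names `gaussian_kde` and `reflected_kde` are hypothetical helpers, not part of any library.

```python
import math
import random


def gaussian_kde(data, x, bandwidth):
    """Standard Gaussian kernel density estimate at a single point x."""
    total = sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data)
    return total / (len(data) * bandwidth * math.sqrt(2 * math.pi))


def reflected_kde(data, x, bandwidth, lower=0.0):
    """Reflection-corrected estimate for a support bounded below at `lower`.

    The kernel mass that the standard estimate leaks below the boundary
    is folded back by also evaluating it at the mirror image of x.
    """
    if x < lower:
        return 0.0
    mirrored = 2 * lower - x  # mirror image of x across the boundary
    return gaussian_kde(data, x, bandwidth) + gaussian_kde(data, mirrored, bandwidth)


random.seed(0)
# Exponential(1) data: the support is [0, inf) and the true density at 0 is 1.
sample = [random.expovariate(1.0) for _ in range(2000)]
h = 0.2  # bandwidth picked by hand for illustration (an assumption)

naive_at_0 = gaussian_kde(sample, 0.0, h)
corrected_at_0 = reflected_kde(sample, 0.0, h)
print("true density at 0:  1.0")
print("standard estimate: ", round(naive_at_0, 3))
print("reflected estimate:", round(corrected_at_0, 3))
```

At the boundary itself the reflected estimate is exactly twice the standard one by construction, since the standard estimate places about half of each nearby kernel outside the support. Reflection is only one option; it implicitly assumes the density is flat at the edge, so it still has some bias when the true density has a nonzero slope there.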

Day of the week and the cross-section of returns

I just finished reading an interesting paper by Justin Birru titled “Day of the week and the cross-section of returns” (reference below). The story seems much too simple to be true, yet it appears to hold. In fact, I would probably have skipped it altogether were it not for the stamp of approval from the highly ranked Journal of Financial Economics. However, by the end of the paper I was as convinced as one can be without actually running the analysis.

Bitcoin investing

Bitcoin is a cryptocurrency created in 2008. I have never belonged to team “gets it” when it comes to Bitcoin investing, but perhaps the time has come to reconsider.

Correlation and correlation structure (1); quantile regression

Given a constant speed, time and distance are fully correlated. Provide me with the one, and I’ll give you the other. When two variables have nothing to do with each other, we say that they are not correlated.

You might wish that were the end of it, but it is not. Things are considerably more complicated. By far the most familiar correlation concept is Pearson’s correlation coefficient, which checks for linear dependence only. For that reason we say it is a parametric measure. It can return exactly zero even when one variable is fully determined by the other (link to cool chart).
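A small sketch of both cases, using only the Python standard library. The `pearson` helper, the speed of 60, and the symmetric grid of values are illustrative assumptions, not taken from the post.

```python
import math


def pearson(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Linear case: at a constant speed, distance is a linear function of time.
times = [i / 10 for i in range(1, 101)]
distances = [60 * t for t in times]  # assumed constant speed of 60
r_linear = pearson(times, distances)

# Nonlinear case: y is fully determined by x, yet there is no linear trend
# because x is symmetric around zero.
xs = [i / 10 for i in range(-50, 51)]
ys = [x * x for x in xs]
r_square = pearson(xs, ys)

print("time vs distance:", round(r_linear, 6))
print("x vs x squared:  ", round(r_square, 6))
```

The first coefficient is one up to floating-point error and the second is zero up to floating-point error, even though knowing `x` pins down `x * x` exactly. The zero depends on the symmetry of `xs` around zero; on a one-sided grid the same relationship would show a strong positive correlation.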

Fed Funds Rate futures curves and what they tell us

“The Fed is certainly moving forward with plans to normalize interest rates.” We keep hearing that. We believed it in the past and we believe it now. We believe that the Fed believes it and that, in fact, this means something.

Should we become more suspicious and less trusting given history? Let’s take a look.
