The yearly R in Finance conference is one of my favorites:
The R language has improved over the years. Amidst numerous splendid augmentations, the magrittr package by Stefan Milton Bache allows us to write more readable code. It uses an ingenious piping convention which will be explained shortly. This post talks about when to use those pipes, and when to avoid them in your code. I am all about readability, but I am also about speed. Use the pipe operator, but watch the tradeoff.
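As a rough illustration of the piping convention, here is a minimal sketch comparing nested calls with the same steps written using magrittr's `%>%` operator. The vector `x` and the particular chain of functions are made up for the example, and the sketch assumes the magrittr package is installed.

```r
library(magrittr)  # provides the %>% operator

# Toy input, chosen only for illustration.
x <- c(1, 4, 9)

# Nested calls read inside-out.
nested <- round(mean(sqrt(x)), 1)

# The pipe passes the left-hand side as the first argument of the next call,
# so the same steps read left to right.
piped <- x %>% sqrt() %>% mean() %>% round(1)

# Both forms compute the same value.
identical(nested, piped)
```

The piped form is arguably easier to read, but each `%>%` is an extra function call, which is where the speed tradeoff comes from.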
Admit it, you always thought there is something off with how boxplots look. You can tell there should be some way to depict more information; they simply look much too spacious. Evidently you are not the only one. Many have tried to suggest better ways to plot the same information. Here is more on 40 years of boxplots.
How many times have you placed the legend in an R plot only to discover it is being overrun by some points or lines in the chart? Usually what comes next is a trial-and-error phase where you adjust the location, changing the x and y coordinate arguments, and redraw the plot to check whether the legend or text is now positioned so that it is fully readable.
This is more an RStudio tip than an R tip. It would be nice to know how the following works in other editors, but RStudio is common enough and awesome enough for it to be relevant.
Insert or bind?
This is the first in a series of planned posts, sharing some R tips and tricks. I hope to cover topics which are not easily found elsewhere. This post has to do with loops in R. There are two ways to save values when looping:
1. You can predefine a vector and fill it, or
2. you can recursively bind the values.
Which one is faster?
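One way to check is to time both approaches. The sketch below is only an illustration: the function names `fill` and `bind`, the loop size `n`, and the squared values being stored are all made up for the example, and the timings will vary by machine and R version.

```r
# Hypothetical loop size; a larger n makes any gap more visible.
n <- 1e4

# Option 1: predefine the vector, then fill it by index.
fill <- function(n) {
  out <- numeric(n)
  for (i in seq_len(n)) out[i] <- i^2
  out
}

# Option 2: grow the vector by binding one value per iteration.
bind <- function(n) {
  out <- NULL
  for (i in seq_len(n)) out <- c(out, i^2)
  out
}

# Both approaches return the same values...
identical(fill(n), bind(n))

# ...so the only question is elapsed time.
system.time(fill(n))
system.time(bind(n))
```

Binding copies the growing vector on every iteration, so the predefined version typically pulls ahead as `n` grows.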
A few weeks back I gave a talk at the R/Finance 2016 conference, about forecast combinations in R. Here are the slides:
Package Eplot on CRAN.
At least for me, R by faR. MATLAB has its own way of doing things, which to be honest can probably be defended from many angles. Here are a few examples of not-so-subtle differences between R and MATLAB:
Recently, we have been hearing a lot about a housing bubble forming in the UK. It would be great to have a formal test for identifying a bubble as it evolves in real time, but I am not familiar with any such test. However, we can still do something to help us gauge whether what we are seeing is indeed a bubbly process, which is bound to end badly.
MATLAB has it this time, with solid 3D plotting capabilities.
When you are busy with a lengthy project, like writing a paper, you create many objects along the way. Every time you return to the project, you need to remember what is what. In the past, at the start of each working session I would rerun the script and follow what each line does until I had the objects I needed and could continue working. Apart from helping you remember what you are doing, this is very useful for reproducibility, at least given your data, in the sense that you can be sure nothing was overwritten from the console and everything is there in the script. Those days are over.
Roughly speaking, multicollinearity occurs when two or more regressors are highly correlated. As with heteroskedasticity, students often know what it means and how to detect it, and are taught how to cope with it, but not why it is so. From Wikipedia: “In this situation (Multicollinearity) the coefficient estimates may change erratically in response to small changes in the model or the data.” The Wikipedia entry continues to discuss detection, implications and remedies. Here I try to provide the intuition.
If you have 10 possible independent regressors, and none of them matter, you still have a good chance of finding that at least one appears important.
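To put a rough number on "a good chance", here is a small sketch. It assumes each regressor is tested at the 5% level and that the tests are independent; the significance level, sample size, number of simulations, and the helper `one_run` are all assumptions made for the illustration.

```r
set.seed(1)

k     <- 10    # number of irrelevant regressors
alpha <- 0.05  # assumed significance level
n_obs <- 100   # assumed sample size
n_sim <- 500   # assumed number of simulated data sets

# Analytic chance of at least one false positive among k independent tests.
analytic <- 1 - (1 - alpha)^k  # about 0.40

# Simulate: y is pure noise, unrelated to any column of x.
one_run <- function() {
  x <- matrix(rnorm(n_obs * k), n_obs, k)
  y <- rnorm(n_obs)
  # p-values of the k slope coefficients (drop the intercept row).
  pvals <- summary(lm(y ~ x))$coefficients[-1, 4]
  any(pvals < alpha)
}

# Share of simulated data sets with at least one "significant" regressor.
simulated <- mean(replicate(n_sim, one_run()))

c(analytic = analytic, simulated = simulated)
```

Under these assumptions, roughly four in ten noise-only regressions would show at least one regressor that looks significant.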