The good thing about using open-source software is the community around it. There are very many R packages online, and recently CRAN package download logs were released. This means we can have a look at the number of downloads for each package, so to get a good feel for their relative popularity. I pulled the log files from the server and checked a few packages which are known to be related to machine learning. With this post you can see which are the community favorites, and get a feel for the R-software trend growth.
Especially in economics/econometrics, modellers do not believe their models reflect reality as it is. No, the yield curve does NOT follow a three factor Nelson-Siegel model, the relation between a stock and its underlying factors is NOT linear, and volatility does NOT follow a Garch(1,1) process, nor Garch(?,?) for that matter. We simply look at the world, and try to find an apt description of what we see.
Open source software has many virtues. Being free is not the least of which. However, open source comes with “ABSOLUTELY NO WARRANTY” and with no power comes no responsibility (I wonder..). Since no one is paying, by definition it is your sole responsibility to make sure the code does what it is supposed to be doing. Thus, looking “under the hood” of a function written by someone else is can be of service. There are more reasons to examine the actual underlying code.
In April this year, Rstudio notified early users of shiny that Glimmer and Spark servers which host interactive-applications would be decommissioned. Basically, the company is moving forward to generate revenues from this great interactive application service. For us aspirants who use the service strictly as a hobby, that means, in a word: pay.
Basic subscription now costs around 40$ per month. Keeping your applications free of charge is possible BUT, as long as it is not used for more than 25 hours per month. So if your site generate some traffic, most users would simply not be able to access the app. Apart from that, you are subject to some built-in Rstudio’s logo which can’t be removed without having a paid subscription. That is a shame, but a company’s gotta eat right? I am using Rstudio’s services from their very beginning, and the company definitely deserve to eat! only I wish there would be another step between the monthly 0$ option which provides too slim capabilities, and the monthly 40$ option which is, in my admittedly biased opinion, too pricey for a ‘sometimes’ hobby.
Diversity is a real strength. By now it is common knowledge. I often see institutions openly encourage multinational environment and multidisciplinary professionals, with specific “on-the-job” training to tailor for own needs. No one knows a lot about a lot, so bringing different together enhance independent thinking and knowledge available to the organization. Clarity of communication then becomes even more important, and making sure your figures are quickly understandable goes a long way.
One of my Ph.D papers was published recently. It deals with yield curve forecasting.
Here is the code for applying the Nelson-Siegel model to any yield curve.
Package Eplot on cran.
At least for me, R by faR. MATLAB has its own way of doing things, which to be honest can probably be defended from many angles. Here are few examples for not so subtle differences between R and MATLAB:
R takes it. I prefer coding in R over MATLAB. I feel R understands that I do not like to type too much. A few examples:
In the last few decades there has been tremendous progress in the realm of volatility estimation. A major step is the additional use of intraday price path. It has been shown that estimates which consider intraday information are more accurate. Which is to say they converge faster to the real unobserved value of the true volatility.
The summary function in R returns:
Min. 1st Qu. Median Mean 3rd Qu. Max.
9.14 10.70 11.10 11.30 12.10 13.60
For the univariate case I wrote what I consider to be a better summary function which returns:
usum(x) # For univariate Summary
min med mean max sd skew kurt
1 9.14 11.13 11.35 13.65 1.057 0.3028 -0.6389
No NA's in the series
13.65 13.55 13.08 13.13