This post shares a short code snippet for making your own screen saver in R, The Matrix-style:
The R Journal is the open access, refereed journal of the R project for statistical computing. It features short to medium length articles covering topics that should be of interest to users or developers of R.
Christoph Weiss, Gernot Roetzer and I joined forces to write an R package and the accompanying paper: Forecast Combinations in R using the ForecastComb Package, which is now published in The R Journal. Below you can find a few of my thoughts about the journey towards publication in The R Journal, and a few words about working in a small team of three, from three different locations.
A higher-order function is a function that takes one or more functions as arguments, and/or returns a function as its result. This can be super handy in programming when you want to tilt your code towards readability and still keep it concise.
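Both halves of that definition can be illustrated with a minimal sketch. The names `apply_twice`, `make_power` and `square` are made up for this example; `sapply` and `Filter` are base R functions that themselves take a function as an argument.

```r
# Takes a function as an argument: apply `f` to `x` twice.
apply_twice <- function(f, x) f(f(x))

# Returns a function as its result (a "function factory").
make_power <- function(exponent) {
  force(exponent)          # evaluate the argument now, not lazily later
  function(x) x^exponent
}

square <- make_power(2)

squares <- sapply(1:5, square)                       # 1 4 9 16 25
fourth  <- apply_twice(square, 3)                    # square(square(3)) = 81
evens   <- Filter(function(v) v %% 2 == 0, 1:10)     # 2 4 6 8 10
```

The calls stay short because the "what to do" is passed around as a value instead of being spelled out in a loop each time.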
The R language has some quirks compared to other languages. One thing which you need to constantly watch for when moving to or from R is that R starts its indexing at one, while almost all other languages start indexing at zero, which takes some getting used to. Another quirk is the explicit need for clarity when modifying a variable, compared with other languages.
Take Python for example, but I think it looks the same in most common languages:
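For contrast, here is a rough sketch of the R side of both quirks. It assumes the second quirk refers to the fact that R has no in-place operators or methods, so a change only sticks once the result is assigned back; the variable names are purely illustrative.

```r
x <- c(10, 20, 30)

# Quirk 1: indexing starts at one.
first  <- x[1]      # 10, the first element
zeroth <- x[0]      # an empty vector (numeric(0)), not an error

# Quirk 2: functions return modified copies; the original is untouched
# until you explicitly assign the result back.
sorted_copy <- sort(x, decreasing = TRUE)
x_after_sort <- x           # still 10 20 30

x <- c(x, 40)               # appending requires re-assignment

counter <- 0
counter <- counter + 1      # there is no `+=` operator in R
```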
The R language has improved over the years. Amidst numerous splendid augmentations, the magrittr package by Stefan Milton Bache allows us to write more readable code. It uses an ingenious piping convention which will be explained shortly. This post talks about when to use those pipes, and when to avoid using pipes in your code. I am all about readability, but I am also about speed. Use the pipe operator, but watch the tradeoff.
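As a minimal sketch of the piping convention and of how one might check the tradeoff: `x %>% f()` is equivalent to `f(x)`, so a chain of calls reads left to right instead of inside-out. The vector `x` and the repetition count `n` are arbitrary choices for this illustration, and the timings depend on your machine and on the magrittr version, so no numbers are claimed here.

```r
library(magrittr)  # provides the %>% pipe

x <- c(4, 9, 16, 25)

# Nested calls read inside-out.
nested <- round(mean(sqrt(x)), 1)

# The same computation with the pipe reads left to right.
piped <- x %>% sqrt() %>% mean() %>% round(1)

# A rough way to look at the overhead of the pipe: repeat both versions
# many times and compare the elapsed times.
n <- 10000
t_nested <- system.time(for (i in seq_len(n)) round(mean(sqrt(x)), 1))
t_piped  <- system.time(for (i in seq_len(n)) x %>% sqrt() %>% mean() %>% round(1))
```

Both versions return the same value; any difference between `t_nested` and `t_piped` is the price paid for readability, which matters mostly inside tight loops.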
In this post about the most popular machine learning R packages I showed the incredible, exponential growth displayed by R software, measured by the number of package downloads. Here is another graph, which shows a more linear growth in R (and an impressive growth in Python) as measured by the percentage of questions posted on Stack Overflow.
This is more of an RStudio tip than an R tip. It would be nice to know how the following works for different editors, but RStudio is common enough and awesome enough for the following to be relevant.
Insert or bind?
This is the first in a series of planned posts, sharing some R tips and tricks. I hope to cover topics which are not easily found elsewhere. This post has to do with loops in R. There are two ways to save values when looping:
1. You can predefine a vector and fill it, or
2. You can repeatedly bind each new value onto the result as you go.
Which one is faster?
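The two options can be sketched and timed as follows. The function names `fill_prealloc` and `fill_bind` and the size `n` are made up for this illustration, and the timings will differ from machine to machine. Binding typically loses as `n` grows, because `c()` copies the whole vector on every iteration.

```r
n <- 10000

# Option 1: predefine a vector and fill it.
fill_prealloc <- function(n) {
  out <- numeric(n)
  for (i in seq_len(n)) out[i] <- i^2
  out
}

# Option 2: bind each new value onto the result.
fill_bind <- function(n) {
  out <- NULL
  for (i in seq_len(n)) out <- c(out, i^2)
  out
}

t_prealloc <- system.time(a <- fill_prealloc(n))
t_bind     <- system.time(b <- fill_bind(n))

same_result <- isTRUE(all.equal(a, b))  # both approaches give the same vector
```

Compare `t_prealloc["elapsed"]` with `t_bind["elapsed"]`, and increase `n` to see how the gap develops.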
In part 1 of Good coding practices we considered how best to code for someone else, be it a colleague who is coming from an Excel environment and is unfamiliar with scripting, a collaborator, a client or the future-you, the you a few months from now. In this second part, I give some of my thoughts on how best to write functions, the do’s and don’ts.
At work, I recently spent a lot of time coding for someone else, and like anything else you do, there is much to learn from it. It also got me thinking about scripting, and how best to go about it. To me it seems that the new working generation mostly tries to escape from working with Excel, but “let’s not kid ourselves: the most widely used piece of software for statistics is Excel” (Brian D. Ripley). This quote is almost 15 years old, but Excel still has a strong hold on the industry.
Here I discuss a few good coding practices. Coding for someone else is not to be taken literally here. ‘Someone else’ is not necessarily a colleague; it could just as easily be the “future you”, the you reading your code six months from now (if you are lucky enough to get responsive referees). Has it never happened to you that your past-self was unduly cruel to your future-self? That you went back to some old code snippets and dearly regretted not adding a few comments here and there? Of course it has.
When it comes to coding, “good” is usually measured by efficiency: good = efficient. Here the metric is different: good = friendly. They call this literate programming. There is a fairly deep discussion about this paradigm by John D. Cook (follow what he has to say if you are not yet doing so; there is something for everyone).
Especially in economics/econometrics, modellers do not believe their models reflect reality as it is. No, the yield curve does NOT follow a three-factor Nelson-Siegel model, the relation between a stock and its underlying factors is NOT linear, and volatility does NOT follow a GARCH(1,1) process, nor GARCH(?,?) for that matter. We simply look at the world, and try to find an apt description of what we see.
In April this year, RStudio notified early users of Shiny that the Glimmer and Spark servers which host interactive applications would be decommissioned. Basically, the company is moving forward to generate revenue from this great interactive application service. For us aspirants who use the service strictly as a hobby, that means, in a word: pay.
A basic subscription now costs around $40 per month. Keeping your applications free of charge is possible, but only as long as they are not used for more than 25 hours per month. So if your site generates some traffic, most users would simply not be able to access the app. Apart from that, you are stuck with a built-in RStudio logo which can’t be removed without a paid subscription. That is a shame, but a company’s gotta eat, right? I have been using RStudio’s services from their very beginning, and the company definitely deserves to eat! I only wish there were another step between the monthly $0 option, which provides too slim capabilities, and the monthly $40 option, which is, in my admittedly biased opinion, too pricey for a ‘sometimes’ hobby.
One of my Ph.D. papers was published recently. It deals with yield curve forecasting.
Here is the code for applying the Nelson-Siegel model to any yield curve.
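What follows is not the code from the paper, only a minimal sketch of one common way to fit the three-factor Nelson-Siegel curve: fix the decay parameter and estimate the level, slope and curvature factors by ordinary least squares. The function names `ns_loadings` and `fit_ns`, the fixed value `lambda = 0.0609` (the Diebold-Li choice for maturities measured in months), and the synthetic yields are all assumptions made for this illustration.

```r
# Nelson-Siegel factor loadings for maturities `tau` and decay `lambda`.
ns_loadings <- function(tau, lambda) {
  slope <- (1 - exp(-lambda * tau)) / (lambda * tau)
  cbind(level = 1, slope = slope, curvature = slope - exp(-lambda * tau))
}

# Estimate the three factors by OLS for a fixed lambda.
fit_ns <- function(yields, tau, lambda = 0.0609) {
  X <- ns_loadings(tau, lambda)
  fit <- lm.fit(X, yields)
  list(beta = fit$coefficients, fitted = drop(X %*% fit$coefficients))
}

# Synthetic curve, made up for illustration (maturities in months):
# generate yields from known factors, then check that the fit recovers them.
tau <- c(3, 6, 12, 24, 36, 60, 84, 120)
true_beta <- c(level = 4, slope = -2, curvature = 1.5)
yields <- drop(ns_loadings(tau, 0.0609) %*% true_beta)

ns <- fit_ns(yields, tau)
```

To apply it to a real curve, replace `yields` and `tau` with observed yields and their maturities; `ns$beta` then holds the estimated factors and `ns$fitted` the fitted curve.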
When you are busy with a lengthy project, like writing a paper, you create many objects along the way. Every time you come back to the project, you need to remember what is what. In the past, at the start of each new working session I used to rerun the script anew and follow what each line was doing until I got back the objects I needed and could continue working. Apart from helping you remember what you are doing, this is very useful for reproducibility, at least given your data, in the sense that you are sure nothing was overwritten from the console and it is all there in the script. Those days are over.