Each year I supervise several data-science master’s students, and each year I find myself repeating the same advises. Situation has worsen since students started (mis)using GPT models. I therefore have written this blog post to highlight few important, and often overlooked, aspects of thesis-writing. Many points apply also to writing in general.
Category: MiscTips
Rython tips and tricks – Clipboard
For whatever reason, clipboard functionalities from Rython are under-utilized. One utility function for reversing backslashes is found here. This post demonstrates how you can use the clipboard to circumvent saving and loading files. It’s convenient for when you just want the quick insight or visual, rather than a full-blown replicable process.
From Excel to Script? Add Browsing Points
“If it ain’t broke, don’t fix it”.
Rython tips and tricks – Snippets
R or Python? who cares! Which editor? now that’s a different story.
I like Rstudio for many reasons. Outside the personal, Rstudio allows you to write both R + Python = Rython in the same script. Apart from that, the editor’s level of complexity is well-balanced, not functionality-overkill like some, nor too simplistic like some others. This post shares how to save time with snippets (easy in Rstudio). Snippets save time by reducing the amount of typing required, it’s the most convenient way to program copy-pasting into the machine’s memory.
In addition to the useful built-ins snippets provided by Rstudio like lib
or fun
for R and imp
or def
for Python, you can write your own snippets. Below are a couple I wrote myself that you might find helpful. But first we start with how to use snippets.
On Writing Math
There are a lot of examples for skills that despite being greatly needed, we never get any formal training for. At least nothing is built into our core educational programs. Few examples are: how to read well, how to listen well, or how to develop your can-do mental attitude. Writing well, in particular math-writing, is another such example. Here I share few pointers from my own experience of reading and writing math.
R tips and tricks – shell.exec
When you startup your machine, the first thing you do is to open the various programs you work with. Examples: your note-taking program, the pdf file that you need to read, the ppt file you were last working on, and of course your strongest link with the outside world nowadays; your email box. This post shows how to automate this process. Windows machines notoriously need restarting for every little (un)install. I trust you will find this startup automation advice handy.
R tips and tricks – Timing and profiling code
Modern statistical methods use simulations; generating different scenarios and repeating those thousands of times over. Therefore, even trivial operations burden computational speed.
In the words of my favorite statistician Bradley Efron:
“There is some sort of law working here, whereby statistical methodology always expands to strain the current limits of computation.”
In addition to the need for faster computation, the richness of open-source ecosystem means that you often encounter different functions doing the same thing, sometimes even under the same name. This post explains how to measure the computational efficacy of a function so you know which one to use, with a couple of actual examples for reducing computational time.
R + Python = Rython
Enough! Enough with that pointless R versus Python debate. I find it almost as pointless as the Bayesian vs Frequentist “dispute”. I advocate here what I advocated there (“..don’t be a Bayesian, nor be a Frequenist, be opportunist“).
Nowadays even marginally tedious computation is being sent to faster, minimum-overhead languages like C++. So it’s mainly syntax administration we insist to insist on. What does it matter if we have this:
1 2 3 |
xsquare <- function(x){ x^2 } |
Or that
1 2 3 4 |
def xsquare(x): return x**2 |
R tips and tricks, on-screen colors
I like using for many reasons. Two of those are (1) easy integration with almost whichever software you can think of, and (2) for its graphical powers. Color-wise, I dare to assume you probably plotted, re-specified your colors, plotted again, and iterated until you found what works for your specific chart. Here you can find modern visualization so you are able to quickly find the colors you look for, and to quickly see how it looks on screen. See below for quick demo.
Matrix-style screensaver in R
This post shares short code snippet to make your own screen saver in R, The Matrix-style:
Visualizing Time series Data
This post has two goals. I hope to make you think about your graphics, and think about the future of data-visualization. An example is given using some simulated time series data. A very quick read.
R tips and tricks – the assign() function
The R language has some quirks compared to other languages. One thing which you need to constantly watch for when moving to- or from R, is that R starts its indexing at one, while almost all other languages start indexing at zero, which takes some getting used to. Another quirk is the explicit need for clarity when modifying a variable, compared with other languages.
Take python for example, but I think it looks the same in most common languages:
R tips and tricks – the locator function
How many times have you placed the legend in R plot to discover it is being overrun by some points or lines in the chart? Usually what comes next is a trial-and-error phase where you adjust the location, changing the arguments of the x and y coordinates, and re-drawing the plot again to check if the legend or text are now positioned such that they are fully readable.
Dark background theme for reading
I am picking up on Rob Hyndman’s suggestion on dark themes for writing. I carried on with a bit of “internet-scientific” reading. Opinions on ‘dark vs white’ background themes, which is better for your eyes, are mixed. You are busy, so just the bottom line: do what works for you.