## Publication in Significance – code

Couple of months ago I published a paper in Significance – couple of pages describing the essence of deep learning algorithms, and why they are so popular. I got a few requests for the code which generated the figures in that paper. This weekend I reviewed my code and was content to see that I used a pseudorandom numbers, with a seed (as oppose to completely random numbers; without a seed). So now the figures are exactly reproducible. The actual code to produce the figures, and the figures themselves (e.g. for teaching purposes) are provided below.

## R tips and tricks, on-screen colors

I like using for many reasons. Two of those are (1) easy integration with almost whichever software you can think of, and (2) for its graphical powers. Color-wise, I dare to assume you probably plotted, re-specified your colors, plotted again, and iterated until you found what works for your specific chart. Here you can find modern visualization so you are able to quickly find the colors you look for, and to quickly see how it looks on screen. See below for quick demo.

## R tips and tricks – Paste a plot from R to a word file

In this post you will learn how to properly paste an R plot\chart\image to a word file. There are few typical problems that occur when people try to do that. Below you can find a simple, clean and repeatable solution.

## Understanding Variance Explained in PCA

Principal component analysis (PCA) is one of the earliest multivariate techniques. Yet not only it survived but it is arguably the most common way of reducing the dimension of multivariate data, with countless applications in almost all sciences.

Mathematically, PCA is performed via linear algebra functions called eigen decomposition or singular value decomposition. By now almost nobody cares how it is computed. Implementing PCA is as easy as pie nowadays- like many other numerical procedures really, from a drag-and-drop interfaces to `prcomp` in R or `from sklearn.decomposition import PCA` in Python. So implementing PCA is not the trouble, but some vigilance is nonetheless required to understand the output.

This post is about understanding the concept of variance explained. With the risk of sounding condescending, I suspect many new-generation statisticians/data-scientists simply echo what is often cited online: “the first principal component explains the bulk of the movement in the overall data” without any deep understanding. What does “explains the bulk of the movement in the overall data” mean exactly, actually?

## Matrix-style screensaver in R

This post shares short code snippet to make your own screen saver in R, The Matrix-style:

## Visualizing Time series Data

This post has two goals. I hope to make you think about your graphics, and think about the future of data-visualization. An example is given using some simulated time series data. A very quick read.

## R tips and tricks – boxplots for large data

Admit it, you always thought there is something off with how boxplot look like. You can tell there should be some way in which more information can be depicted, they simply look much too spacious. Evidently you are not the only one. Many have tried to suggest better ways to plot the same information. Here on 40 years of boxplots.

## Visualizing Tail Risk

Tail risk conventionally refers to the risk of a large and sharp draw down of the portfolio. How large is subjective and depends on how you define what is a tail.

A lot of research is directed towards having a good estimate of the tail risk. Some fairly new research also now indicates that investors perceive tail risk to be a stand-alone risk to be compensated for, rather than bundled together with the usual variability of the portfolio. So this risk now gets even more attention.

## Introduction

Google “K-means clustering”, and you usually you find ugly explanations and math-heavy sensational formulas*. It is my opinion that you can only understand those explanations if you don’t need them; meaning you are already familiar with the topic. Therefore, this is a more gentle introduction to K-means clustering. Here you will find out what K-Means Clustering, an algorithm, actually does. You will get only the basics, but in this particular topic, the extensions are not wildely different.

## R tips and tricks – the locator function

How many times have you placed the legend in R plot to discover it is being overrun by some points or lines in the chart? Usually what comes next is a trial-and-error phase where you adjust the location, changing the arguments of the x and y coordinates, and re-drawing the plot again to check if the legend or text are now positioned such that they are fully readable.