When you google “Kurtosis”, you encounter many formulas to help you calculate it, talk about how this measure is used to evaluate the “peakedness” of your data, maybe some other measures to help you do so, maybe all of a sudden a side step towards Skewness, and how both Skewness and Kurtosis are higher moments of the distribution. This is all very true, but maybe you just want to understand what Kurtosis means and how to interpret it, in the same way you interpret standard deviation (roughly, the average distance from the average). Here I take a shot at giving a more intuitive interpretation.
It is true that Kurtosis is used to evaluate the “peakedness” of your data, but so what? Why do you care whether it is peaky or not peaky? You are not picky. The answer is that you would like to know where your variance comes from. If your data is not peaky, the variance is spread throughout the distribution. If your data is peaky, you have little variance close to the “center”, and most of the variance originates from the “sides”, or what we call the tails.
Calculating the mean and the variance gives you important information about your data, namely:
- Where is it? (mean)
- How scattered is it? (variance)
You can get more: you can also get a clue about the cause of the variance. In what way is your data scattered? It may be that your data has a high standard deviation, yet the distribution is relatively flat, with just a handful of observations in the tails. That is why you want to take a look at the Kurtosis measure.
The next figure presents three simulated known distributions: Uniform, Normal and Laplace.
The Uniform distribution has the highest standard deviation (4.26 for this simulation), so it is the most scattered one, but it has the lowest Kurtosis (-1.2), since its variance is relatively “equally distributed”. The Laplace distribution has the highest Kurtosis, since its variance is the least evenly distributed: a small portion of the variance comes from the center (that is the peakedness referred to earlier) and a large portion comes from the tails. Code for the simulation and graph is below. Thanks for reading.
Kurtosis here means excess Kurtosis; add 3 to get the raw Kurtosis.
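For reference, here is one common way to write the measure using central moments. This is a sketch of the plain moment-based version; packages often apply small-sample corrections, so their output can differ slightly. The symbols $m_k$, $n$ and $\bar{x}$ are notation introduced here for illustration.

```latex
m_k = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^k, \qquad
\text{excess kurtosis} = \frac{m_4}{m_2^{2}} - 3
```

Subtracting 3 makes the Normal distribution score 0. In theory the Uniform scores -1.2 and the Laplace scores 3.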
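library(VGAM)   # rlaplace(), for the Laplace draws
library(e1071)  # kurtosis(); an assumption, the original does not say which package supplied it
                # (e1071 returns excess kurtosis, which matches the note above)
n = 5000
lap = rlaplace(n, location = 0, scale = 1)
uni = runif(n, min(lap), max(lap))  # uniform over the same range as the Laplace sample
nor = rnorm(n)
# Kernel density estimates, used only for plotting
denlap = density(lap); dennor = density(nor); denuni = density(uni)
lwd1 = 3
plot(denlap, lwd = lwd1, zero.line = FALSE, main = "Where does your variance come from?",
     xlab = "Simulated Normal (red), Laplace (black) and Uniform (green)")
lines(dennor, col = 2, lwd = lwd1)
lines(denuni, col = 3, lwd = lwd1)
# Legend entries follow the colour order: Normal (2), Uniform (3), Laplace (1)
legend("topleft", c(paste("Kurtosis = ", round(kurtosis(nor), digits = 2)),
                    paste("Kurtosis = ", round(kurtosis(uni), digits = 2)),
                    paste("Kurtosis = ", round(kurtosis(lap), digits = 2))),
       col = c(2, 3, 1), lty = 1, lwd = lwd1, text.col = c(2, 3, 1), bty = "n")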
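The question of where the variance comes from can also be checked numerically. What follows is a minimal sketch in base R, not part of the original simulation. The helpers `tail_variance_share()` and `excess_kurtosis()` are hypothetical names introduced here, the 2 standard deviation cutoff is an arbitrary choice, and the Laplace sample is drawn as the difference of two exponential draws so that no extra package is needed.

```r
# Hypothetical helper: share of the total squared deviation that comes from
# observations lying more than k standard deviations away from the mean.
tail_variance_share <- function(x, k = 2) {
  dev <- x - mean(x)
  in_tail <- abs(dev) > k * sd(x)
  sum(dev[in_tail]^2) / sum(dev^2)
}

# Hypothetical helper: plain moment-based excess kurtosis (no small-sample correction).
excess_kurtosis <- function(x) {
  dev <- x - mean(x)
  mean(dev^4) / mean(dev^2)^2 - 3
}

set.seed(1)  # arbitrary seed, only for reproducibility
n <- 5000
lap <- rexp(n) - rexp(n)             # difference of two Exp(1) draws is Laplace(0, 1)
nor <- rnorm(n)
uni <- runif(n, min(lap), max(lap))  # same range as the Laplace sample

samples <- list(Uniform = uni, Normal = nor, Laplace = lap)
result <- data.frame(
  sd              = sapply(samples, sd),
  excess_kurtosis = sapply(samples, excess_kurtosis),
  tail_share      = sapply(samples, tail_variance_share)
)
print(round(result, 2))
```

The `tail_share` column should be ordered the same way as the Kurtosis column. A Uniform variable can never be more than about 1.73 standard deviations from its mean, so its tail share is essentially zero. For the Normal, roughly a quarter of the variance comes from beyond 2 standard deviations in theory, and for the Laplace nearly half.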