This post concerns a paper I came across while checking the nominations for best paper published in the International Journal of Forecasting (IJF) for 2012-2013. The paper bears the annoyingly irresistible title “The illusion of predictability: How regression statistics mislead experts”, and was written by Emre Soyer and Robin Hogarth (henceforth S&H). It resonates with another paper, published in Psychological Review (1973) by Daniel Kahneman and Amos Tversky: “On the psychology of prediction”. Although S&H do not cite the 1973 paper, I find the two highly related.
At least for me, R by faR. MATLAB has its own way of doing things, which to be honest can probably be defended from many angles. Here are a few examples of not-so-subtle differences between R and MATLAB:
We are all standing on the shoulders of giants. Bradley Efron is one such giant: he invented the bootstrap in 1979 and later co-authored the very influential 2004 paper on Least Angle Regression (with accompanying software written in R).
I just finished reading “An estimate of the science-wise false discovery rate and application to the top medical literature”. The authors ask how much of what we read in scientific journals is actually incorrect, or false.
Presenting properly is important. Here is how I think it should look:
Slides 18 and 30 are especially nice:
Numbers are useful (I think we can all agree on that). If you own a smartphone, you can install this Runmeter app. When you run, you can take the smartphone with you and activate the app to collect interesting numbers like distance, pace, fastest pace, heart rate*, calories etc. Now we can load the statistics collected over the past months into R and have a quantified look at the progress.
A few days ago I dropped my iPhone and cracked it. Though the iPhone still works, I decided it would be good to have a backup of my contacts on my desktop. A fancy backup can be achieved with the following two-step procedure: first, sync your contact information with Facebook; second, send yourself an Excel file with the full details of your mobile contacts: phone number, date of birth, home page, work address and other details extracted from their Facebook pages. The process takes only a few minutes and is free.
Do doctors unnecessarily prolong colonoscopies? The answer is: they surely might.
It seems like a very long while since my bachelor's degree. Checking my bookshelf the other day, I thought I would flag some of the books which helped or inspired me along the way. Here they are, in no particular order.
For those of you who are into machine learning, here you can find a cool collection of databases to try your favorite algorithm on. I chose one of the 200 available and fit a logistic regression model. The idea is to see which properties are common among those who earn above 50K a year. Our data is such that the “y” variable is binary: a value of 1 is given if the individual earns above 50K and 0 if below. We know many things about the individual: level of education in years, age, whether she is married, where she is from, which sector she works in, how many hours she works per week, race, and more. We can fit a logistic regression, which is quite standard for a binary dependent variable, and see which variables are important.
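To make the idea concrete, here is a minimal sketch of a logistic regression fit in plain Python. It does not use the actual income data: the two features (`education` and `hours`, both standardized) and the "true" coefficients are made up for illustration, and the fit uses simple gradient ascent on the log-likelihood rather than whatever routine a statistics package would use.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.5, n_iter=500):
    """Fit intercept + slopes by gradient ascent on the mean log-likelihood."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)  # w[0] is the intercept
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = yi - sigmoid(z)  # observed minus predicted probability
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w


# Synthetic stand-in for the real data (assumption, not the actual dataset):
# columns are standardized education and weekly working hours.
random.seed(0)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(400)]
# y = 1 means "earns above 50K"; the generating coefficients are invented.
y = [1 if random.random() < sigmoid(-1.0 + 1.5 * e + 0.8 * h) else 0
     for e, h in X]

w = fit_logistic(X, y)
print("intercept, education, hours:", [round(v, 2) for v in w])
```

A positive coefficient means that a higher value of the variable raises the estimated probability of earning above 50K, which is the sense in which a variable is "important" here. With real data you would also want standard errors, which this sketch does not compute.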