One of my colleagues sent me an article in the Financial Times from March 17, 2017 entitled “How to save a penalty: the truth about football’s toughest shot. On star goalie Diego Alves, game theory and the science of the spot kick.” I found the article interesting for two reasons.

- It has a fun discussion of the psychology and game theory of taking penalty kicks. It points to the paper by Ignacio Palacios-Huerta in which he shows that professional soccer players take penalties in a way that is consistent with Nash equilibrium (or minimax) behavior. The FT article also includes an interesting interview with Palacios-Huerta and discusses his “analysis of ideal penalty-taking strategies for the then Chelsea manager Avram Grant before the Champions League final against Manchester United in 2008.”
- The FT article highlights Diego Alves, Valencia’s goalkeeper, and argues that he is particularly good at stopping penalties. The FT article argues that Diego Alves’ stopping record (he stopped 22 of 46 penalties – a very high number compared to the average stopping rate of 25% of all goalkeepers combined) cannot be explained by chance alone.

In this blog post I want to comment on the second point. It is actually wrong, and it is wrong for an interesting reason. Moreover, the mistake is both easy to make and very common.

So how does the analysis in the FT work? We are interested in testing the null hypothesis that Diego Alves’ true stopping probability is (at most) 25%. If this null hypothesis is true, the probability of observing Diego Alves stopping 22 (or more) shots out of 46 is given by the binomial formula and can be calculated to be 0.0676%. Statistically minded readers will know this as the p-value (associated with the given one-sided null hypothesis).
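For readers who want to check the binomial calculation, here is a minimal Python sketch. The function name `binomial_tail` and the variable names are my own choices for illustration; the only inputs taken from the text are the 46 penalties faced, the 22 stops, and the 25% stopping rate under the null hypothesis.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# One-sided p-value: probability of 22 or more stops out of 46
# if the true stopping rate is 25%.
p_value = binomial_tail(n=46, k=22, p=0.25)
print(f"p-value: {p_value:.6f}")
```

The result should be roughly 0.000676, that is, about 0.0676%.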

As this probability is very small (much smaller than the commonly used 5% cut-off), the FT article then claims that the null hypothesis must be wrong and that Diego Alves really has a true stopping rate that is higher than 25%.

But this is not necessarily so. So what is wrong? In this analysis we forget that it was no accident that we chose to look at Diego Alves. Why did the FT look at Alves? I guess the only reason for this choice is that he has made an unusually high proportion of stops. So if Diego Alves had not made such a high number of stops, but someone else had, we would have looked at that someone else. In other words, the FT article is not about Diego Alves; it is about the goalkeeper with the highest proportion of stops. This person just so happens to be Diego Alves.

So what does this mean? It means that we have to take into account that we are looking at the highest empirical stopping rate of about 400 goalkeepers. The FT has a nice graph looking at penalties faced (on the x-axis) and penalties stopped (on the y-axis) for many goalkeepers from the top leagues. I roughly counted (estimated) that this graph has about 400 dots (i.e. 400 goalkeepers).

Now I am going to make a mistake myself. I am going to make the empirically clearly wrong assumption that all of these 400 goalkeepers have faced exactly 46 shots. I do this so that I can make my point fairly simply and quickly. If I had the full data I could do this correctly, but I think the simplification is sufficient for the point I want to make.

Suppose therefore that we have 400 goalkeepers who each have faced 46 shots. What is then the likelihood that the best of them has stopped 22 (or more) of these 46 shots if the true stopping rate is 25%? This can be calculated as 1 minus the probability that all of them make fewer than 22 stops. Treating the goalkeepers' results as independent, this is given by 1-(1-0.000676)^400. This probability is 23.7131%.
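The same calculation can be sketched in a few lines of Python. The names `binomial_tail` and `prob_best_reaches` are hypothetical helpers of mine; the sketch rests on the simplifying assumptions stated above (400 goalkeepers, each facing exactly 46 penalties, each with a true stopping rate of 25%) plus the assumption that the goalkeepers' results are independent of each other.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def prob_best_reaches(num_keepers: int, single_tail: float) -> float:
    """Probability that at least one of num_keepers independent goalkeepers
    reaches the threshold, given the tail probability for a single keeper."""
    return 1 - (1 - single_tail) ** num_keepers


# Probability that one given keeper stops 22 or more of 46 penalties.
single_tail = binomial_tail(n=46, k=22, p=0.25)

# Probability that the best of 400 such keepers stops 22 or more.
prob_best = prob_best_reaches(num_keepers=400, single_tail=single_tail)

print(f"single keeper: {single_tail:.6f}")
print(f"best of 400:   {prob_best:.4f}")
```

The second number should come out at a little under 24%, far above the tiny probability for a single goalkeeper chosen in advance.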

To summarize, if the true stopping rate of all goalkeepers is 25% and if all 400 goalkeepers face 46 penalty shots, then the probability that the best of them stops 22 or more of these 46 penalties is 23.71%. That is not sufficiently improbable (it is, for instance, higher than the usual 5% cut-off) to make me abandon the null hypothesis that all goalkeepers (including Diego Alves) have the same 25% stopping rate. In other words, I do not believe that there is anything particularly special about Diego Alves’ penalty kick stopping ability. We could do another test after he has faced another 46 penalties. I doubt he will save another 22 of these.