10.14.2011

Statistics in Hockey

In a post to HOTH, a Daniel Wagner marshals Kierkegaard[1] against the specter of statistical analyses of sports, hockey in particular. Life, we are told in bold, Hegel-inspired non-sequiturs, is too complex for the reductive lens of empirical models. Without an Archimedean point, the whole edifice of objective knowledge is a farce. Life is lived, or chosen, or something. To be fair, he is making a point about what is left out of statistics. I think this is important enough to warrant a response.

I do not mean to denigrate Kierkegaard. He is without a doubt one of the most captivating writers of the 19th century. Nor do I wish to suggest I have no sympathy for the existentialist celebration of choice and contingency. But in making this point, which has little to do with contemporary social science or statistics[2], Wagner inadvertently demonstrates why we invented and use things like statistics and social scientific methods.

To be clear, I am not a statistician. I spent several years studying political philosophy in a social science department devoted to science. Therefore, I have a somewhat tortured relationship to the harder sciences. And while as an unemployed philosopher I would like to believe that my knowledge of Kierkegaard is more valuable than, say, Nate Silver’s mastery of meta-statistics, this is not true when we are trying to answer most questions. Here, I hope to lay out the necessity of statistical or scientific methods, while pointing out where an effective critique can be leveled. In other words, both sides in this hockey stats debate are doing it wrong.

Why do we have statistics?

We should strive towards objectivity.[3] Let’s take that as a maxim. If you do not care about whether you are making true statements about the world, then this debate is moot. You can go ahead and pretend that Scott Gomez or Brian Campbell are not overpaid because they have some ineffable “something.”

The claim “Scott Gomez is overpaid” is debatable. We should argue about interesting or important questions. How do we adjudicate between different standards for evaluating the player? We offer reasons. Now, I can observe a player (let’s use Dan Carcillo) and say that one night he’s taking bad penalties and hurting his team with bad backchecking. You, my reader, might observe the same game and say that his penalties were important in creating energy and motivating a stagnant or disinterested team. Who is right? Notice, both positions are based on watching one player in one game. And indeed the scope of our question makes this set of observations appropriate. It is reasonable to use our personal observations to justify a judgment. And, indeed, we can then argue over those judgments.

However, what if we want to know whether Dan Carcillo is a valuable player over the course of the season? Over five seasons? Or if players of Carcillo’s type are as valuable over their careers as NHL general managers seem to think? In these cases, observations based on one game are not the best way to answer the question. And, worse, it is probably impossible for one person to watch enough hockey to answer the last question, regarding all players in all games (even restricting this to the games played since the lockout). Notice, the question here is not how complicated life is, but how, given that complexity, we can ground our judgments in actual facts.

Statistics is a way of increasing our observations in order to make better judgments or answer more complicated questions. Increasing our observations allows us to be more objective because we are no longer restricted to the things we have directly experienced. When Wagner argues for a more encompassing measure of hockey performance, he is merely proving that we need the sorts of tools that social science provides.

What’s wrong with our assumptions?

Wagner is correct that statistics is inherently limited. It cannot (and does not try to) describe life itself. This can appear as a deficiency if you think our goal is to comprehend the whole of existence in one go. If you understand the project of knowledge as progressive, iterative, and long-term, the problem is a lot less acute.

Regardless, practitioners need to be clear about the limits to their methods.

  1. Is the data accurate? A blogger is spending his time recording scoring chances for the Flyers. Great. But when I use this data, I am limited by his accuracy in coding those chances, and his judgment in deciding what counts as a chance. If there are only twelve chances for the Flyers in one game and he mistakenly codes two chances, that is a pretty high margin of error. I have never tried to code zone starts and ends (for Corsi scores) but I imagine they are also difficult to get right. Statisticians will correctly argue that this problem dissipates as we get more data. But I have not seen enough discussion about data reliability to take them seriously yet.

  2. Statistics are limited by what it is possible to measure. We cannot really measure “effort” or “grit” or “hockey smarts.”[4] That is the biggest reason why we do not see statisticians talking about “effort.” It is not that they do not appreciate it, but that it is impossible to quantify. Further, measurements are mapped onto models that describe how we think a factor is important. We cannot directly measure what we mean by “driving the play.” So we measure the things we can observe and add them together into what we think describes “driving the play.” Does the addition of these data points add up to what we think it does? That’s debatable and I think most honest statisticians will admit there’s some slippage between the question and the model.

  3. Wagner seems to be most annoyed by the inflated claims to predictive power made by statistics. In many ways, this conceit is a result of imitating the hard sciences, like physics, where repeatability (and thus prediction of future results) is the hallmark of scientific progress. In the social sciences, things are less clear. To Wagner, we look at career batting average and dismiss a player’s offseason conditioning. This would be a shame if it turned out that players routinely defy expectations. I suspect the outliers are few. Nevertheless, statistics should never be taken as deterministic. They are, or should be, primarily descriptive. To the extent that scientists, having drawn conclusions, then point to future expectations, this is considered by the scientific community to be an added feature that improves testability of the hypothesis. In other words, if the prediction does not come true, we have reason to doubt the model or the data or the researcher. Can Wagner point to any systemic errors revealed by these mistaken predictions? Or, really, any specific predictions at all? To be clear, this is a real problem with anti-statistics screeds. They rarely mobilize compelling, specific counterexamples to demonstrate how statistical models misrepresent hockey.

  4. A final, hockey-specific point. Football and baseball are, in my mind, iterative games. Each down or at-bat is a new data point. Thus each game is effectively divided into dozens of specific observations about pitch count, runners on base, etc. This gives an enormous advantage to modeling those interactions compared to a fluid game like hockey. Once the puck drops it is effectively chaos on the ice until the whistle blows. That is a much different kind of game to try and model statistically. Not that it cannot be done, but it is much more complicated to extract relevant data to answer specific questions. I’m thinking of shot counts: is a player scoring because he’s good, or because he takes a lot of shots? Is he accurate because he is getting better passes? Off the top of my head, I cannot think of any way to effectively model the interactions of shot count and accuracy without ignoring some pretty obviously important stuff that is hard to measure in hockey. (I’m sure smarter people can improve and critique this example.)
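
To put rough numbers on the first point in the list above, here is a minimal Python sketch. It assumes that coding mistakes are independent from game to game and as likely to add a chance as to drop one; that assumption is mine, not something the tracking data has shown. The function name and its defaults are illustrative, borrowed from the twelve-chance example rather than from any real data set.

```python
import math

def relative_error(games, chances_per_game=12, miscoded_per_game=2):
    """Typical coding error as a fraction of all recorded chances.

    Assumption (not from any real data set): mistakes are independent
    across games and unbiased, so they accumulate like a random walk,
    growing with sqrt(games) while the true total grows with games.
    """
    total_chances = games * chances_per_game
    typical_error = miscoded_per_game * math.sqrt(games)
    return typical_error / total_chances

# Compare one game, a ten-game stretch, and a full 82-game season.
for games in (1, 10, 82):
    print(f"{games:>2} games: about {relative_error(games):.1%} error")
```

Under those assumptions the error falls from roughly 17 percent for a single game to under 2 percent across an 82-game season. If the coder is consistently too generous or too strict, the mistakes do not cancel and more data will not help, which is why the reliability question still matters.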

To sum up: As a social scientist, I understand that “intangibles” matter. The charge that statistics ignores them is a straw man. No one denies that there are facts that fit uncomfortably under the scientific gaze. I also understand that my own personal horizon of observations is insufficient to say anything truthful, meaningful or worthwhile about the world. Statistics is one tool for expanding my understanding of the world, a tool with its own limits and strengths.


  1. This is a poor exegesis of Kierkegaard, by the way. And if I am confused about the relevancy of Either/Or to hockey, comparing him to Hegel is not helping matters at all.  ↩

  2. Neither Kierkegaard, Hegel, nor the more recent possible example of Sartre was responding to statistics. Perhaps Wagner might have referenced Marcuse or Horkheimer, writers who deplored the positivism of the post-War period. But even here, any discussion would have to take statistics as it is, not as critics would like to imagine it. This is a point about how we use historical figures to fight our battles for us. If I pull out Karl Popper and have him argue for or against social science, does that improve my argument? Or have I just made Popper into a puppet for my own views about hockey stats?  ↩

  3. There is a complex question indicated by the conjunction of should and objectivity. If you care about that, read a lot of Max Weber or Thomas Kuhn.  ↩

  4. Conversely, just looking at someone and saying he has “grit” does not count as a measurement either.  ↩