In these politically charged times, the day after election day in the US, I thought it would be valuable to write up something to serve as a bit of an olive branch between such polarized parties. I, of course, am talking about the divide between the classical hockey community and the “analytics community”.
For those who don’t know who I am, I think it’s important to explain how I perceive my story because it will set the tone for why I believe this piece is worth writing. I apologize in advance if it sounds narcissistic to autobiographically vamp, but I think it’ll pay off to know the source of the recommendations I’m about to give and what motivates them.
I have been blogging for this site, All About the Jersey, since 2014 after winning an “audition” process (along with Gerard and Alex) held by our fearless leader, John Fischer. As a math teacher with a degree in Stats, I naturally came at writing from a numbers perspective, but I only had a casual understanding of modern hockey analytics. At that stage, I’d have considered myself an analytics fan. After a few years of regularly contributing that sort of content, I learned more and more while reading research done on sites like Hockey-Graphs until I basically ran out of new stats and research to read up on and felt confident enough to start generating my own material. Having worked on this blog with Ryan Stimson (creator of The Passing Project), my first interest was in presenting that passing data, which was being tracked by Corey Sznajder (who had made his data available for something insane like $5 on Patreon). Using this, I posted my first ever Tableau visualization on Twitter.
A few people whose work I had read for years like Alison Lukan, Josh Garik, and Stephen Burtch retweeted my viz. Encouraged by the positive feedback, I created another viz with the goal of being able to compare the sum of players’ contributions in the data Corey tracked. At the time I had about 300 followers, and that tweet got over 100 likes, including praise from a lot of the Hockey-Graphs staff like Sean Tierney, EvolvingWild (Luke and Josh Younggren), and Dom Luszczyszyn. I don’t say this to brag; I doubt anyone reading this would be impressed by that last sentence anyway. I say this to point out that I knew NONE of these people. And despite my anonymity, folks like the EvolvingWild twins, Prashanth Iyer, and Matthew Barlowe all took time to give me feedback on my work. I found “the analytics community” (insofar as that term is well-defined) to be extremely welcoming and extremely transparent. Since then, I’ve done independent research on scoring chances, created an NHL age curve, developed a network-based NHL equivalency model, and presented at multiple analytics conferences. Rather than being an analytics fan, I now consider myself a member of that community. None of this happens if John doesn’t give me a shot at this blog, and if those other writers and researchers don’t advance my work. This isn’t a story about me being an impressive person (if you know me, you’ve probably never had that impression); it’s a story about this community being interested in almost nothing more than cultivating and circulating new ideas and fresh voices.
My personal experience with this community has left me somewhat discouraged by how I feel it is perceived. There is some overlap among analytics communities across sports, so I caught wind of a statement ESPN’s Bomani Jones made about the NBA’s analytics community that expresses a sentiment I believe is reflected in our own.
“this happens, in large part, because the "analytics community" does a terrible job explaining itself.” https://t.co/QpoxyuQVHD - bomani (@bomani_jones), November 2, 2020
I see these kinds of takes so often that they must hardly seem controversial to those who say these things. Bomani is talking mostly about just explaining metrics, but the skepticism and occasional vitriol with which those in the traditional sports platforms view the analytics community is palpable. They believe the community to be opaque, condescending, indifferent to communication, and interested in nothing but proving how smart it is.
The reason for my long preamble is to explain why, to me, this notion seems completely detached from reality. I am someone who was able to learn about analytics from reading the community’s great writing circa 2014, was accepted to a blog as a 20-year-old with no writing experience, was retweeted by writers at the pre-eminent hockey analytics blog on the internet, got help from experts in their free time, and has been cited on metahockey’s repository in each of the past 4 years. My personal (read: biased) experience is that of a transparent, welcoming, modest community. But clearly, many disagree with me. So, I want to know how to fix this. How can we mend the broken bridge of communication between these communities?
To find some answers to this, I asked people who have stood on this bridge with a foot on each side. Alison Lukan has written data-driven storytelling for The Athletic with the main concern of using data to tell stories for all. Dimitri Filipovic founded the PDOcast, a podcast about hockey analytics aimed at making the metrics approachable. Ryan Stimson is an All About the Jersey alum who founded the Passing Project, but has also coached and written about tactics. Rachel Doerrie is a former video analyst for the Devils and also co-hosted the “Staff and Graph” podcast. I asked all of them what these communities need to do in order to close the gap. I’ll share with you some of the common denominators here.
~ ~ ~ ~ ~
I’m not going to force you to listen to me extol the virtues of analytics here because it’s not that kind of piece. Briefly, I believe analytics is, most importantly, a tool for overcoming biases. There is no such thing as a perfectly unbiased metric, but at least using numbers allows you to overcome some of the common prejudices. The movie Moneyball actually ages fairly well in this respect. For example, Chad Bradford threw funny but efficiently, Scott Hatteberg was a catcher with a bum arm who got on base a ton, and Kevin Youkilis was a chubby but valuable hitter. It’s a good way of checking to make sure you haven’t been sold a bag of beans by your own eyes. I’ve talked in this space before about the value of consilience: concurring viewpoints arrived at in different ways. There’s a line in the movie where Billy Beane (Brad Pitt) is discussing a player with a low batting average. Scouts like his swing and deem him a “good hitter”, to which Beane replies “If he’s a good hitter then why doesn’t he hit good?” If the numbers are a radical departure from your assessment, it’s worth looking at the thing more critically.
So, given that this is somewhere analytics can genuinely be helpful, it would behoove those who study it to make it accessible to the people it helps. So, what is already done pretty well by the community in that respect? According to Stimson, quite a lot:
“I think communication is great. I think analysts are very open about their work and provide extensive writeup and code for their decisions.” - Ryan Stimson
With regard to transparency, I believe he’s right. Check out the source pages for a site like MoneyPuck, or the public work of folks like the Younggrens (ex: WAR, xG), Micah McCurdy (ex: everything), and Manny Perry (ex: WAR, game prediction). If you want to read up on EXACTLY how these statistics have been calculated, and the results of the models, they are available for all to see, most of them entirely without charge. But, according to some of those who spent more time in front offices, it’s not enough to be transparent; you have to be accessible. When I asked for advice on how to communicate, Doerrie said “Be simple and concise - no one cares about a giant write up. Keep it to one page.” This brings us to a more profound piece of advice, which is to make it clear how your content can be seen on the ice:
“Relate it to gameplay. Control entry allowed % means the D are doing xyz which leads to scoring chances.” - Rachel Doerrie
This is, I think, an extremely important piece of information and part of why I got into visualizing Corey’s data as one of my first endeavors. A metric is going to be so, SO much more convincing and illuminating if you can point to something that impacts it in the game film or live on a broadcast. In a sport in which the score changes so seldom, knowing what types of events are valuable (ex: controlled zone entries) can enhance enjoyment of the sport. This is part of why I reached out to Filipovic. If you follow him on Twitter you probably see him occasionally post highlights/montages of random players. But they’re not random if you know your analytics. For instance, Mark Stone is arguably the most valuable player in the NHL outside McDavid on a per-minute basis over the past half-decade despite not having that reputation among classical hockey fans. Part of this is high performance in “Impact” metrics like “RAPM” which isolate how much more his team is generating/preventing when he is on the ice. That’s a really abstract idea though. So, Filipovic showed us a montage of Stone abusing opponents with stick-lifts. These plays are a visible example of what Stone does well: they prevent a chance for the opponent and create one going the opposite way. It’s a play any fan recognizes as “good”.
“Being able to visualize it so it goes from theoretical to practical is almost [as] important [as verbiage]. Which is why being able to identify defensive plays and highlight them really helps people come around to why someone like Mark Stone has the underlying numbers and shot impacts that he does.” - Dimitri Filipovic
I think these are all great points on how analytics community members can help their work assimilate into mainstream hockey culture. In talking with all of these people, though (as well as some of my analytically skeptical Twitter followers), I found there’s also a less intellectual component of the journey towards analytical acceptance. It’s more of a bedrock fact about effective communication: simply, be nice. I asked Doerrie about what the analytics community does really poorly and she told me exactly where they can get out of their own way.
“The main discredit is they want everything to be absolute....Like you don’t know everything. You can never lose your feel for the game. Maybe the star feels more comfortable and plays better with some s****y D. So there’s an opportunity cost. You have to overspend or hand out NTC to get free agents or you lose them. That’s the business. The cockiness the analytics community carries themselves with is a huge issue.” - Rachel Doerrie
There is a tendency among those who study analytics to assume that anyone who disagrees with them, including players and coaches who make this their full-time job, is merely a victim of a faulty, biased eye test because numbers never lie. Doerrie went on to say that about 85% of the blame in the communication gap goes towards the analytics community. You’ll notice that that leaves out 15%. And, for what it’s worth, my other interviewees thought that 15% could be a bit higher. When I asked Stimson about where he believes the responsibility lies for the disconnect, he was a little less willing to give the traditional hockey establishment a pass:
“Yup. I think many of them [commentators, coaches, etc.] don’t want to take the time to learn something, even if all that means is read an article or ask a few questions.” - Ryan Stimson
And this brings me to my last point: it’s not only on the analytics community. Part of the reason for the story of my life I gave you in the beginning is to demonstrate that the members of this community want nothing more than to tell you about their work and talk about what it might mean. They love hockey. Why else would they spend hundreds/thousands (yes) of hours generating this content? There are massive new innovations every year in the public analytics sphere. On the off chance that these people aren’t all collectively useless, why not read about what they’re up to? Is it not possible that they might discover something useful or interesting? Lukan split the difference and talked about how it’s on both parties to understand each other:
“Ultimately, the responsibility for effective communication lies on both sides, but media needs to seek out information if it doesn’t understand something. There are people who do it and use data effectively - like Mike P Johnson. If there’s a new stick, do you not seek to learn what’s new about it and what’s effective? Same goes here.” - Alison Lukan
I think that analogy is brilliant. There can sometimes be a double standard in what is expected of researchers, modelers, etc. I was watching MNF and I heard Louis Riddick talking about Cover 2 Man Under without explaining to any members of the audience what that is. He also talked about how Leonard Fournette was entering the game because he was the Bucs’ new “Nickel back”, a term normally used for a defensive player. Now, I happen to know what he meant in both cases because I watch way too much football. But, surely, a casual fan would have either been confused by or just ignored that. Do we get on Riddick for not making the tactics approachable?
No. And we shouldn’t! He is making the broadcast more enjoyable for people who know what he’s talking about. Is it not possible, though, that there is some segment of sports fans that would find a broadcast with advanced metrics more enjoyable? Don’t we want as many entry points for as many types of people as possible?
So, here’s my advice. Practice some empathy. Hockey People, imagine, for a moment, that the Analytics People are just fans who love the sport and want to contribute and perhaps even have something valuable to offer. Analytics People, imagine, for a moment, that the Hockey People may occasionally have experienced enough of the sport to possess an intuition that can fill in some of the many gaps in our imperfect understanding of what matters. Hockey People, remember that you’re humans and, like the rest of us, you are flawed and biased. Analytics People, remember that players are humans and they do live on this planet when they’re not a node in your algorithm. Hockey People, be open to allowing for new types of analyses. Analytics People, don’t claim that all other analyses are extinct. Everything is changing, but there’s no reason we can’t change together.