Way back in November 2009, Tom Awad at Puck Prospectus wrote one of the most important articles for hockey stats: A Castle Built on Sand. To summarize, Awad pointed out that the data behind what was then the forefront of hockey analytics was not as reliable as assumed. Scorer bias was real and it varied from rink to rink. While much has advanced in analytics and many have been hired because of it - including Awad - the castles are still built on sand. Data validation remains an issue with the National Hockey League.
As you may or may not know, I am currently involved in a data tracking project. I am collecting various data points surrounding icing events in every single New Jersey Devils game last year. As of this writing, I’ve gone through 312 events. This is not a post to give you any preliminary results; the tracking comes first. But I have to share this: Not all 312 events were legitimate icing calls. In fact, there were five icing events that were not icing plays at all. Five instances where it was clear to anyone at the rink, watching on TV, or even just listening to the broadcast announcer that there was no icing - but the scorer at that game called it icing. Five instances of either a puck leaving the rink or an offside - and the scorer still recorded it as icing. Five instances where the puck went nowhere near the goal line and was never going to reach it, which Rule 81 requires for icing (link goes to a PDF of the rulebook), and yet it was listed as icing on the play-by-play log.
I understand the scorers are human, but that is a rather significant thing to miss. Especially since actual calls for icing usually result in a twenty to thirty second delay, a defensive zone faceoff for the team that iced the puck, and that team not being allowed to change its players. It is not quite as obvious as a goal, but it is up there. Yet, the scorers made those errors - and they were not corrected, even for something as easily identifiable as icing. I fear I will find more and while it may not mean much in the larger picture, it casts further doubt on the data the NHL currently collects. What is more frequent are shooting attempts left unrecorded and faceoff wins being awarded by different means (some scorers think it is by where the puck goes after the draw, others are doing it based on who possesses the puck first). These are not just errors in reporting on their own. There are larger implications.
If you go through the archives at this very site, you may have noticed I tend to use a lot of information from the various hockey analytics sites over the years. Most of those sites are now gone as most of their owners were hired by teams: Time on Ice (Vic Ferrari, a.k.a. Tim Barnes), Behind the Net (Gabe Desjardins), Extra Skater (Darryl Metcalf), and War on Ice (Andrew Thomas, Sam Ventura, Alexandra Mandrycky) to name a few. Most recently, Calgary hired David Johnson of Hockey Analysis, which was previously the longest-running active database and was a key source for With or Without You charts. All of those sites plus Corsica (hopefully returning in the Fall, depending on Manny Elk) and others were reliant on scripts that scraped data from each NHL game. The play-by-play log, which I usually link in my recaps (example here from a bright spot last season), has more data within its code than what is seen. Through calculation, that is how you get Corsi, Fenwick, PDO, expected goals for and against, and so forth. This is not just for a database of hockey stats, but also for models such as GAR and Weighted Points Above Replacement, among others. Of course, much analysis, opinion, and commentary has been and will be given based on those resources.
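To make "through calculation" concrete, here is a minimal sketch of how Corsi, Fenwick, and PDO fall out of a list of shooting events. It assumes a simplified, hypothetical record of (shooting team, event type), where "SHOT" means a saved shot on goal; the real play-by-play log has many more fields and its own labels, and the counts below are invented for illustration.

```python
from collections import Counter

# Hypothetical event labels, not the NHL's actual play-by-play codes.
ATTEMPTS = {"GOAL", "SHOT", "MISS", "BLOCK"}   # Corsi: every shooting attempt
UNBLOCKED = {"GOAL", "SHOT", "MISS"}           # Fenwick: unblocked attempts only
ON_GOAL = {"GOAL", "SHOT"}                     # shots on goal, goals included


def pct(part, whole):
    """Percentage of part in whole, or None when there is nothing to divide by."""
    return 100.0 * part / whole if whole else None


def shot_metrics(events, team):
    """Compute CF%, FF%, and PDO for `team` from (shooting_team, kind) tuples."""
    made = Counter(kind for shooter, kind in events if shooter == team)
    faced = Counter(kind for shooter, kind in events if shooter != team)

    def total(counts, kinds):
        return sum(counts[k] for k in kinds)

    cf, ca = total(made, ATTEMPTS), total(faced, ATTEMPTS)
    ff, fa = total(made, UNBLOCKED), total(faced, UNBLOCKED)
    sf, sa = total(made, ON_GOAL), total(faced, ON_GOAL)
    shooting = pct(made["GOAL"], sf)        # goals per shot on goal, for
    saving = pct(sa - faced["GOAL"], sa)    # saves per shot on goal, against
    pdo = shooting + saving if None not in (shooting, saving) else None
    return {"CF%": pct(cf, cf + ca), "FF%": pct(ff, ff + fa), "PDO": pdo}


# Invented counts for one game, purely for illustration.
events = (
    [("NJD", "GOAL")] * 1 + [("NJD", "SHOT")] * 4
    + [("NJD", "MISS")] * 3 + [("NJD", "BLOCK")] * 2
    + [("OPP", "GOAL")] * 2 + [("OPP", "SHOT")] * 8
    + [("OPP", "MISS")] * 2 + [("OPP", "BLOCK")] * 3
)
metrics = shot_metrics(events, "NJD")
```

With these invented counts, the Devils take 10 of the 25 total attempts, so CF% comes out to 40.0. Every number downstream depends on the scorer having logged each attempt in the first place.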
In other words: hockey analytics is based on data collected by the NHL. Which we know to be suspect.
To be frank, this is not a new problem. Awad’s article in 2009 specifically highlighted shots and shot locations as being off, which might as well be true today. Especially since shot location partially drives the expected goals model as well as counting scoring chances for players. As for shots, well, Awad correctly noted in 2009 that the Devils have under-counted shots for years and years. Further, I’ve written about scorer bias at length as far back as 2010 for real time super stats (hits, blocks, etc.). If this is not a new issue, what’s different now other than learning that the scorers can record icings that did not happen?
Well, for starters, with Hockey Analysis now gone, there is a renewed call for NHL.com to improve their stats page. Example from Satchel Price, who writes things at the NHL side of SBNation and may or may not be aware of this blog:
It's still incredible to me that the NHL can't be the one to provide fans with a proper advanced statistics resource. - Satchel Price (@SatchelPrice) August 3, 2017
The NHL does offer some basics. For what it does have, I think it does well. I do not fully agree that people should avoid it entirely. I do not fully agree that NHL.com should work to make it better. Yes, they have the resources and stability that no hobbyist can offer. I question whether they could stay up to date with new developments in stats or respond to feedback or questions like a hobbyist could (and did). I also question whether the effort involved in becoming a comprehensive source for all NHL stats would be in the best interest of the NHL or NHL.com. Even if they decide that it is, the real first step to get to where they want to go is what the league should be pushing for instead: securing accurate data collection.
Allow me to admit some ignorance. I do not know enough about the system scorers use to record events. I do not know what training they go through or if there are any qualifications to be a scorer. I do not know if corrections are allowed or if event logs are reviewed. I do not know if scorers are given common definitions for events to track. I do not know what means the scorers are given to track the games, whether it is live viewing, live viewing on video, or something else. I do not know if teams have multiple scorers (I think New Jersey does), and what is done to make sure the information is consistent from scorer to scorer. What I do know is that the current method is not working. Scorer bias is rife and there are plain errors in the event log.
What I will guess as a way to get to a solution is that there needs to be some kind of validation project for recording events in NHL games. This means ensuring that scorers are trained to use a comprehensive and consistent procedure for event recording. This means establishing a solid definition of events so at least everyone is on the same page. This means testing scorers to ensure that a run of play with, say, four shooting attempts will be counted as four shooting attempts regardless of whether the game is in Newark, Manhattan, or Nashville. This means some kind of review process to confirm that the events recorded truly did happen as recorded. With a human process, there will always be room for some error. But through process control, training, and feedback, it can be improved.
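As a rough illustration of the testing idea, here is a minimal sketch that compares each scorer's counts for the same video clips against a reviewed reference count. Everything in it is hypothetical: the clip names, the scorer labels, the numbers, and the assumption that a trusted reference count exists at all. I do not know how the league actually evaluates its scorers.

```python
def audit_scorers(reference, recorded):
    """Compare each scorer's counts to reference counts for the same clips.

    reference: {clip_id: reviewed_count}
    recorded:  {scorer: {clip_id: count}}; a clip a scorer skipped counts as 0.
    """
    report = {}
    for scorer, counts in recorded.items():
        diffs = {clip: counts.get(clip, 0) - truth
                 for clip, truth in reference.items()}
        report[scorer] = {
            # Negative means under-counting on average, positive over-counting.
            "mean_bias": sum(diffs.values()) / len(diffs),
            # Clips to send back for review and feedback.
            "mismatched_clips": sorted(c for c, d in diffs.items() if d != 0),
        }
    return report


# Invented shooting-attempt counts for three clips.
reference = {"clip1": 4, "clip2": 2, "clip3": 5}
recorded = {
    "newark": {"clip1": 3, "clip2": 2, "clip3": 4},
    "nashville": {"clip1": 4, "clip2": 2, "clip3": 5},
}
report = audit_scorers(reference, recorded)
```

In this made-up case the "newark" scorer misses one attempt on two of the three clips, which is the kind of consistent under-count that training and feedback would aim to catch.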
I am hopeful that advances in technology can help make the process less human. I do not know what ever came of it, but back in 2015, the NHL announced that they were working with Sportvision for player and puck tracking in real-time. They implemented it for the 2015 All-Star Game and it was used again at the 2016 World Cup of Hockey. In theory, this could address the issues with shot location and provide additional data regarding shots, pressure, and more. This would be big. It would also be expensive to implement. This Greg Wyshynski article at Puck Daddy from January 2016 followed up on Sportvision and it brings up the many challenges with implementation from cameras in arenas to chips in pucks, all of which come with a significant cost. That the NHL and Sportvision re-united later in 2016 is a hopeful sign; but, as far as I know, things have been quiet on that front since then. It is an open question whether the data that would come from that would be public. Nevertheless, it certainly could be utilized in conjunction with other events that the technology cannot record. The concept is sound; it comes down to implementation and usage - whether it is with Sportvision or someone else.
Unfortunately, there is not a whole lot the hobbyist, like myself, can really do other than do it ourselves. This is what was done for scoring chances years ago. However, that was a huge undertaking then. It required trust that whoever did it knew what a scoring chance was and could record it accurately. Doing so for more events than shots in dangerous locations, let alone all events, is not really feasible. It takes a lot of time to just track icings - assuming they are actual icings; an entire game would just be too much to keep up with. Especially to provide the data in a timely manner. There may be a more elegant solution but I will have to leave that to more elegant minds.
Even more unfortunately, we must press on with what we have. I think some optimism is in order. With every “big” hockey stats site going dark, a new one has emerged. I think the void will be filled in some way because while there may not be enough demand for the NHL to fill it, there is among the hobbyists who want to know more about the game and further the cutting edge of analytics. Awad wrote back in 2009 that while the data is suspect, it should not deter one from the use of the stats from it. It made sense then and I think it still holds today. The opinion, analysis, and commentary from the stats of today is based on what is available and known. When there is new information, then we can make new opinions, analyses, and commentaries. While flawed, the data certainly has not held back any use or growth of stats in the NHL.
Plus, for all we know, the general conclusions may remain the same even with more accurate data. It may only be the details that change. And they may not change that much. Here is a hypothetical example. Miles Wood truly being around a 44-45% CF% (or, in the opposite way, a 38-39% CF%) player instead of the 41.2% CF% recorded for him by Natural Stat Trick last season does not change the general point that Wood was sturdy as balsa in the run of play in 2016-17. The conclusion that Wood has to be utilized better and/or he needs to be better in 5-on-5 hockey remains even though the datapoint may shift. Now, if the shift is larger, then the conclusion must change - such as it is.
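The arithmetic behind that hypothetical can be sketched quickly. The counts here are made up so that they produce a 41.2% CF%; they are not Wood's actual on-ice totals, and the number of "missed" attempts is an assumption chosen only to show the size of the shift.

```python
def cf_pct(cf, ca):
    """Corsi For percentage: attempts for as a share of all attempts."""
    return 100.0 * cf / (cf + ca)


def adjusted_cf_pct(cf, ca, missed_for=0, missed_against=0):
    """CF% after adding attempts the scorer is assumed to have missed."""
    return cf_pct(cf + missed_for, ca + missed_against)


# Made-up counts: 412 attempts for and 588 against is 41.2%.
recorded = cf_pct(412, 588)

# Assume 50 attempts for went unrecorded: 462 of 1050 is 44.0%.
if_undercounted_for = adjusted_cf_pct(412, 588, missed_for=50)

# Assume the 50 missed attempts were against instead: roughly 39.2%.
if_undercounted_against = adjusted_cf_pct(412, 588, missed_against=50)

# Either way the player stays well under 50%, so the conclusion holds.
still_below_even = max(recorded, if_undercounted_for, if_undercounted_against) < 50
```

Even a 5% miss rate in one direction, which would be a lot, moves the figure by about three percentage points and leaves it well below break-even.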
Regardless, through my own tracking and what I have observed among the hockey analytics over the years, Awad’s essay from 2009 remains as relevant as ever. And it is not an easy problem. It may prove to be too much of a challenge, too much of an “inside problem,” for the hockey stats community - whoever they are. But the reduction (or removal) of doubt in the data collected would only serve to strengthen the analytics we see today. Garbage in usually only results in garbage out; but that is not a fault of the process or the model. It does not make the suspect data any more suspect. Therefore, the call really should be for improvements in data collection and not stats being reported - although both could certainly still happen.
Until then: the castles are still on sand.