Way, way, way, way back in 2010, I wrote a post detailing why I believed there was a scorer bias for the New Jersey Devils based on how specific stats differed between home and road games. This was specifically about what the NHL used to call “super stats”: blocked shots, missed shots, hits, giveaways, and takeaways. Not only did it appear that there was a scorer bias in New Jersey, the counts for those stats had more variation and significantly different means for home teams throughout the NHL compared to road teams. It was enough for me to take the “super stats” with a grain of salt, along with understanding that counts at the Prudential Center are going to be different from other rinks. As much as I support and utilize hockey stats, both basic and “advanced,” I do know that Tom Awad had it right at Hockey Prospectus all the way back in 2009: the castles are built on sand.
However, that was way back in 2010. Much has changed since then. We saw the Devils go to the Stanley Cup Finals. We had a lockout. We saw some Russian guy retire to get out of his contract. We saw new owners for the Devils. We saw Lou leave the Devils. We saw massive turnover with players and personnel to the point where only a handful of people in the organization now were there eight years ago. In the world of analytics, we now have advancements to the point where there’s an expected goals model and even WAR models that incorporate multiple statistics in regression models to attempt to assign value to an individual player. (Aside: I wrote about that and my skepticism with two of them here.) To that end, I think now is as good a time as ever to revisit this concept of scorer bias. Is it still in place at the Prudential Center? How is it in the league? While they are not called super stats anymore, let’s look at the hits, the missed shots, the blocked shots, the giveaways, and the takeaways from the last five seasons to find out. For this re-visitation, I even included the infamously-undercounted-in-New-Jersey-for-years shots on net. There’s a lot to dive into, so let’s split it up into two posts. This post will look at the hits, giveaways, and takeaways. Tomorrow’s post will look at shots: actual shots, blocked shots, and missed shots.
All numbers are from the Team Stats at Corsica, which I now recommend because A) it actually works, B) it works rather smoothly, and C) they are one of the few sites that does home and road splits of their data. Seriously, back in 2010, I only used NHL.com for the data because they did have filters for road and home data. They do not have that now. Alas, times really have changed. There’s a fourth reason I’ll get to later in this post. The numbers are pulled from all-situations, regular season play without adjustments. Home, away, and total splits are accounted for in the following charts. (Note: % R-H Diff means Road-Home percent difference - negative means there’s more happening in home games, positive means there’s more on the road. I calculated that, not Corsica.)
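For anyone who wants to reproduce a road-home percent difference, here is a minimal sketch in Python. The post only gives the sign convention (negative means more at home), so the choice of the home count as the denominator is an assumption, and the function name `road_home_pct_diff` is made up for illustration. The example uses the Florida giveaway counts cited later in this post.

```python
def road_home_pct_diff(road: float, home: float) -> float:
    """Road-minus-home difference as a percentage of the home count.

    Assumption: the home count is the denominator. The post does not state
    which denominator was used, only that negative means more at home.
    """
    if home == 0:
        raise ValueError("home count must be non-zero")
    return (road - home) / home * 100.0


# Florida's 2017-18 giveaways as cited later in this post: 366 road, 710 home.
florida = road_home_pct_diff(road=366, home=710)
print(f"Florida giveaways % R-H Diff: {florida:.1f}%")
```

A different denominator (the road count, or the average of the two) would change the size of the number but not its sign.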
Here is what was recorded for giveaways by the Devils in home and away games.
Nothing seems too out of sorts for giveaways on their own with the Devils between home and away games. There’s a lean towards more giveaways at home, but even with a 13% difference in this past season and in 2013-14, it does not appear too significant, especially since the difference was far smaller in the three seasons between them. At most, you could say there’s a lean towards counting more at home.
Now, for comparison’s sake, let’s look at the NHL:
Whoa! Home scorers throughout the league have consistently counted more giveaways for teams on average compared to when those teams are on the road. The difference in these means is at least 100 giveaways. On top of that, the standard deviation for home giveaways has been consistently much, much higher than when those teams are on the road. This means there is more variation among home scorers. That suggests that giveaways are counted differently from rink to rink.
With respect to the Devils, their home count of giveaways does not seem out of step with the league (i.e. it is within plus-or-minus one standard deviation of the mean), with the exception of 2015-16, when it was slightly more than a standard deviation below the league mean. But they do tend to have a lower count compared to the league average. When they hit the road, the scoring bias is much less. Again, in 2015-16 the Devils were at least a standard deviation below the league mean for giveaways, so we could say that maybe the Devils were a bit better in that compared to most.
However, looking at other teams just from this past season alone shows how widely giveaway counts can range. A great example is the team the Devils finished just ahead of for the final playoff spot in the East last season: the Florida Panthers. Per Corsica, Florida led the NHL with 1076 giveaways with 366 on the road and a whopping 710 at home. The Panthers were on the higher end of giveaways in away games, but they were #1 with a bullet in home games for giveaways. That 710 count is ridiculous. It is roughly 1.8 standard deviations above the league mean. I can understand that teams may play a little differently on the road. But giveaways are something to avoid no matter where you’re playing. I do not understand how Florida honestly gave up nearly twice as many pucks in their own building compared to how they did on the road. I don’t think they really did; I think it is more likely that scorer bias is at play here.
The sad thing is that there are multiple possibilities as to why. It could be possible that Florida’s scorer is just too loose with giveaways. Maybe their definition is a broad interpretation - which may or may not be more logical and accurate - and everyone else in the league has a narrower definition. It could be a result of input errors. The point remains: giveaway counts in Florida are quite high. Given their home giveaway counts at Corsica, I would suspect the same for Edmonton, Our Hated Rivals, Montreal, and Toronto as their counts were above the league mean plus one standard deviation.
The concern also runs in the opposite direction. Whose scorers may be too stringent with their giveaway count? I would suspect the home scorers at St. Louis (with only 185 giveaways at home!!), Minnesota, Colorado, Columbus, and Arizona are remarkably stingy with respect to counting giveaways given that their giveaway counts at home are more than a standard deviation below the mean.
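The "more than one standard deviation from the league mean" check used above for both the high and low ends can be sketched in a few lines of Python. This is only an illustration: the function name `flag_outliers` is made up, the counts below are placeholders rather than Corsica data, and using the population standard deviation (`pstdev`) instead of the sample one is an assumption, since the post does not say which was used.

```python
from statistics import mean, pstdev


def flag_outliers(home_counts: dict[str, int], k: float = 1.0) -> dict[str, list[str]]:
    """Split teams into those above mean + k*sd and below mean - k*sd."""
    values = list(home_counts.values())
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    high = sorted(t for t, c in home_counts.items() if c > mu + k * sigma)
    low = sorted(t for t, c in home_counts.items() if c < mu - k * sigma)
    return {"high": high, "low": low}


# Placeholder home giveaway counts, NOT real data.
sample_counts = {
    "Team A": 710,
    "Team B": 185,
    "Team C": 420,
    "Team D": 450,
    "Team E": 400,
    "Team F": 430,
}

flags = flag_outliers(sample_counts)
print(flags)
```

With these placeholder numbers only Team A lands on the high side and only Team B on the low side. The same function could be pointed at the real home and away columns to see how many more teams get flagged at home than on the road.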
While the Devils may not be at either extreme, I would trust the road counts to be closer to reality than the larger and more variable counts by home scorers for giveaways. And I wouldn’t trust the whole stat to tell me much at all since it does not seem to be clear what a giveaway is. The counts certainly do not reflect that it is clear. Same as it was in 2010.
So let’s consider the opposite of giveaways: takeaways. Not every giveaway has a takeaway. That makes sense. A player can just cough up a puck to the opposition without the opposing player doing something to make that happen. Maybe the counts will be more even between home and away games?
Definitely not at the Rock last season! The Devils were fairly consistent in total takeaways for four out of the last five seasons. While you can argue that the 2017-18 team has some more talent than the others, an increase of over 200 takeaways almost seems too good to be true. Especially when that increase is driven mostly by the takeaways counted at the Rock. Yes, road takeaways went up a bit, but an increase of 20 is less out of the ordinary than a jump of almost 200 takeaways. The difference between road and home counts of takeaways from 2013-14 to 2016-17 is not that large. One could believe that both may be legitimate. Not so for 2017-18. I don’t know if the scorer got a new definition or if it is a new scorer or what happened. But it appears that the Rock was home to overcounting takeaways (or they were right about what it is and everyone else is wrong - aren’t scorer bias questions fun?).
Are the Devils at home out-of-step with the NHL when it came to overall takeaway counts for home games?
They actually were not - thanks to a very large standard deviation for home takeaway counts last season. They were on the higher end, sure, but it was within a standard deviation. That said, home takeaway counts were higher in terms of both mean and standard deviation than away takeaway counts. That’s just like the giveaways for the NHL. Also like the giveaways, the away counts have a lot less variation in each of the last five seasons. The total numbers are, again, driven more by the home counts than the away counts. Basically, takeaways are like giveaways in that it appears scorer bias plays a larger role for the home counts.
As an example of where it may be overcounted, I give you the 2017-18 Carolina Hurricanes. Their home scorer credited their team with 652 takeaways. To put that in perspective, second in home takeaways at Corsica was Vegas with 594 and third place was San Jose with 452. To put it another way, Carolina and Vegas were each credited with a takeaway count more than two standard deviations above the league mean. While we know Carolina had a good blueline and they were a very good team in 5-on-5 play last season, that number is high enough to be too good to be true. On the opposite end, the home takeaway counts for Los Angeles (just 136!) and Arizona (194) were more than one standard deviation below the mean. To that end, I suspect Carolina’s and Vegas’ scorers overcount takeaways whereas Los Angeles and Arizona undercount them.
It is not as extreme as it is with giveaways. There aren’t ten teams beyond a standard deviation from the mean of last season. Still, it is true that even across the last five seasons, home counts for takeaways have just been larger with more variation compared to the smaller and more consistent road counts. As with giveaways, I wouldn’t trust the takeaway count in general to tell me much at all since it does not seem to be clear what that is either. The counts certainly do not reflect that it is clear. And the Devils’ home rink may be on the end of counting a lot of them based on last season after four seasons of possibly reasonable counts.
OK, giveaways and takeaways are events that happen in the moment and they can be obscured and fortunes can change quickly enough that we may not know if it meant anything. A more clear-cut event like hits may be easier to count. Checking is very much a thing in this sport. And Corsica counts hits for (hits by the team) and hits against (hits taken by the team) too. Do the Devils scorers count this consistently with the road scorers?
In terms of throwing hits, the Devils appear to have undercounted them from 2013-14 to 2016-17. That changed for 2017-18; it is only a slight lean towards hits at home but it is a switch from the prior four seasons. I wouldn’t say the 2017-18 Devils were more or less physical than those teams. For starters, they were credited for fewer total hits than the past four seasons. For another, the 2015-16 team stood out for throwing out more hits on the road than the other four seasons here. I think the scorer made an adjustment on the hits for side after four seasons of being stingy with the hit counts for New Jersey.
On the hits against side, that appears to be the case, although not as much. The Devils have consistently taken more hits from opponents than they have doled out for the last five seasons. Mark that as another reason to think the 2017-18 team was not so physical. At least in 2017-18, the gap was much smaller than being out-hit by at least a hundred. It was still favoring the opponents on the road, though. That the road percent difference was at least 15% for the prior four seasons further suggests that the home scorer at the Rock was rather stingy with hits for both the Devils and their opponents. Again, I think an adjustment was made in 2017-18.
How do hits square away for the league as a whole?
As we would expect, the total hits for and against would have the same averages. Ditto for the home hits against and the away hits for; and the home hits for and the away hits against. If someone is throwing a hit, then someone is taking a hit. That said, the mean for home hits for has been consistently larger than the mean for away hits for. The opposite holds for the means for hits against. The standard deviations also differ - especially for hits against. The away hits against counts over the whole league have much less variation than the home hits against in each of the last five seasons. Other than that, the differences are not as stark as they were for giveaways or takeaways over the same five seasons.
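To see why those league-wide averages have to match while the standard deviations do not, here is a small Python sketch. The game records, team labels, and function names are all hypothetical placeholders, not Corsica data. Every hit recorded in a game is a "hit for" the home team's home split and a "hit against" the visitor's away split (and vice versa), so the league sums pair up exactly.

```python
from collections import defaultdict

# Hypothetical game records, NOT real data:
# (home_team, away_team, hits_by_home_team, hits_by_away_team)
games = [
    ("Team A", "Team B", 20, 25),
    ("Team B", "Team A", 30, 22),
    ("Team A", "Team C", 18, 27),
    ("Team C", "Team B", 35, 24),
]


def split_totals(games):
    """Accumulate home/away hits for (hf) and hits against (ha) per team."""
    totals = defaultdict(
        lambda: {"home_hf": 0, "home_ha": 0, "away_hf": 0, "away_ha": 0}
    )
    for home, away, home_hits, away_hits in games:
        totals[home]["home_hf"] += home_hits  # home team threw these
        totals[home]["home_ha"] += away_hits  # home team took these
        totals[away]["away_hf"] += away_hits  # road team threw these
        totals[away]["away_ha"] += home_hits  # road team took these
    return dict(totals)


def league_sum(totals, key):
    """Sum one split column over every team in the league."""
    return sum(team[key] for team in totals.values())


totals = split_totals(games)
print(league_sum(totals, "home_hf"), league_sum(totals, "away_ha"))
print(league_sum(totals, "home_ha"), league_sum(totals, "away_hf"))
```

Equal sums over the same number of teams give equal means. The spread is a different matter: a team's home column is recorded entirely by one scorer, while its away column is a blend of many scorers, which is one way the home standard deviations could end up larger.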
That being said, the Devils’ counts at home used to be lower by at least a standard deviation from the league mean for a couple of past seasons. They were within it in 2017-18. Other teams still have signs of over and under counting hits. For counts of hits for at home, six teams exceeded the league mean by over one standard deviation according to Corsica: Edmonton (1,219), Montreal (1,201), Pittsburgh, Ottawa, Arizona, and Los Angeles. On the opposite end, four teams had counts below the league mean by more than a standard deviation: Minnesota (just 577!), Calgary, Nashville, and San Jose. That’s ten teams where there’s reason to really suspect scorer bias for hits for in a notable way in either direction.
What about hits against? There were more teams outside of the mean by over a standard deviation in either direction. On the lower end, there’s Calgary, Minnesota, Buffalo, Columbus, and Vancouver. On the higher end, Chicago’s scorer counted the most hits against with 1,184 followed by Montreal, Boston, Carolina, Pittsburgh, and Toronto. So mark it eleven for hits against counts at home being notably higher or lower than the league mean. Again, this should raise some suspicion on what was counted. Especially since some of these teams also showed up on the hits for side: Montreal and Pittsburgh on the higher end of the counts and Calgary and Minnesota on the lower end.
At least the Devils’ counts from last season were not too far away from the league mean for either hits for or hits against. In the four seasons before then, there is reason to believe that hits were undercounted in general at the Rock. As a whole, I don’t know how much value getting or taking a hit has in the larger view of all hockey events. Hits can be useful in terms of stopping offensive plays and disrupting the opposition. Hits can also be not useful as it can take a player out of the play to the team’s detriment or the hit could fail to separate the puck in a favorable situation. The scorer bias does not seem to be as huge as it is for giveaways or takeaways. But knowing that about a third of the league is considerably high or low with the home hit counts plus what it all means, I’m less convinced about whether this is something worth doing.
There were some changes with the Devils’ home scorer in 2017-18 compared to prior seasons for hits and takeaways - a reduction of the home-away difference for hits and a massive increase of home counts for takeaways. Giveaways did not seem to change much. Still, throughout the league, there were higher averages and variations for home counts of giveaways, takeaways, and (to a lesser extent) hits for and against the home team compared with away counts for those stats. This suggests to me that scoring bias still appears to be a real thing for these stats eight years later. That has not changed. Neither has my hesitation for using giveaways, takeaways, and hits at all for analysis, discussion, and so forth.
This brings me to my fourth reason that I’m happy that Corsica has this information. Remember how I brought up two WAR models in the past week? One of them was by Josh and Luke Younggren and while they did not document an explanation for it, they did present it at a conference. Here’s the presentation at Hockey-Graphs. Check out Slides #22 and #23.
It appears TAKE and GIVE are takeaways and giveaways, respectively, and they are in the model for even strength and special teams. The same appears to be the case for hits, as indicated by the variables iHF and iHA - which are consistent with the abbreviations at Corsica, the source of all the data used in this post. Unless I have this totally wrong - and I hope I do for their sake, or that they have something to adjust for this somehow - this model appears to consider these stats that are impacted by scorer bias. Sure, it could be the case that shots, blocked shots (which appear to be here as iBLK), and missed shots are also impacted by scorer bias based on home and away counts. We’ll see that in Part 2. Still, this raises more doubt in terms of what in the world is resulting from a model utilizing data that has had a strong scorer bias for home teams for the last five seasons. Again, I really hope the Younggren twins have a solid rationale for this and something to account for at least some of this bias. Sure, the hockey analytic castles may be built on sand - same now as it was in 2009 and 2010 - but this WAR model in particular may be built on a swamp. (No, I’m not saying all WAR models are bad; but if it utilizes these three stats, then it needs to come correct about them.)
Admittedly, seeing that helped inspire this two-part post. It was also a reminder that in terms of analysis, the simple base of how stats are even collected remains an issue. As I wrote back in 2010, it is not clear who can sort this out or what could be done about it. Is it the team? Does there need to be a league-wide plan? What about a third-party source? It seems clear to me that not all 31 teams were counting hits, giveaways, and takeaways the same way last season. The 30 teams were not doing so from 2013 to 2017 either - and before then as well. I fear it would be the same for shots as well - we’ll see in Part 2 tomorrow.
In the meantime, what do you make of all of this? Are you surprised that there is a scorer bias? What do you think of the Devils’ own ways of counting these stats? What can be done about them? If bias is corrected, would we want to use giveaways, takeaways, and/or hits for anything? Please leave your answers and other thoughts in the comments. Thank you for reading.