

How to Criticize a Statistical Model: The Nico Hischier Example


Impact models don't capture everything. But are they right about Nico Hischier's defense?


First, it’s important to illustrate the disconnect here. When he was drafted, Nico Hischier said he would model his game after all-time great two-way player Pavel Datsyuk — and that defensive potential was cited by Shero as a reason for selecting him. People have viewed him through that lens his whole career, and it persists to this day. In 2020, Last Word on Sports said “there’s a chance a Selke could be in his future,” and Corey Masisak of The Athletic said that the expectations heading into this season were “60 point [pace] as a fringe Selke Trophy contender.” And, sure enough, Lindy Ruff and the Devils coaching staff relied on him to be that player. They named him captain before he’d played a game for them and immediately handed him the keys to the penalty kill, the #1 faceoff role, and the center spot on the more defensive of the two top lines.

Management immediately saw Nico as a potential defensive star. Writers have continued to label him as such over his career. And the current staff has given him responsibilities that indicate they believe it as well.

The only problem is, it’s not immediately clear that this defensive savant is preventing opponents from creating scoring chances. According to Evolving-Hockey’s player cards, Hischier’s defensive impact percentiles over his 4 seasons have been: 5th, 66th, 43rd, and 40th. Over the past two seasons, not only has he not been Selke-caliber, but the numbers say he’s been below-average defensively. And looking at the underlying numbers, that’s not hard to believe. According to Natural Stat Trick, of the 15 forwards to have played 500+ minutes for the Devils over the last two seasons, Nico ranks 11th in goals allowed (per 60), 12th in shots and scoring chances allowed, and 14th in high-danger chances allowed.

Micah McCurdy’s Hockeyviz uses information from prior seasons as well as the circumstances of a player’s shift usage to estimate their current isolated impact on expected goals against. For Nico, his presence on the ice is expected to increase the opponents’ scoring rate by about 0.12 goals per hour. That means opponents are expected to score 5% more than league average in a given situation when Nico is on the ice.
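The arithmetic behind that 5% figure is simple to sketch. As an assumption for illustration, take a league-average 5v5 rate of roughly 2.4 expected goals per hour; only the 0.12 goals-per-hour impact comes from Hockeyviz.

```python
# Rough sketch of converting an isolated impact into a relative rate.
# The 2.4 xG/hour baseline is an assumed, illustrative league-average
# rate; the 0.12 figure is the Hockeyviz impact quoted above.

LEAGUE_AVG_XGA_PER_HOUR = 2.4  # assumed baseline (not from Hockeyviz)
ISOLATED_IMPACT = 0.12         # extra expected goals against per hour

relative_increase = ISOLATED_IMPACT / LEAGUE_AVG_XGA_PER_HOUR
print(f"{relative_increase:.0%}")  # -> 5%
```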

What is happening here? Why are the numbers painting such a drastically different picture than the public perception? In what’s left of this piece, I want to explain specific features of how these models work in ways that may affect the conclusions you just saw.

What Are Legitimate Criticisms of “Impact” Models?

Some of the best-known impact models include Evolving-Hockey’s RAPM, TopDownHockey’s RAPM, and Hockeyviz’s impact model “Magnus.” These models all estimate the value of a player’s shifts by comparing what happened, in terms of goals and shots, to what we would’ve expected given the score, venue, rest, zone, teammates, and competition. So when people say “that doesn’t account for how easy/hard their job is,” that’s generally not true. But there are things the models can miss. And it helps to understand what they do, so we can understand what they don’t do.
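For a sense of how a RAPM-style model shares out credit, here is a toy sketch. This is not any of the models above, just a small ridge regression on invented shift data, with indicator columns for skaters on the ice plus one circumstance flag:

```python
import numpy as np

# Toy RAPM-style sketch: each row is a shift; columns are indicators
# for which skaters were on the ice plus one circumstance term (a
# defensive-zone-start flag). y is the shot rate against on the shift.
# Ridge regression "shares out" blame among everyone on the ice.
# All numbers are invented for illustration.

# columns: [playerA, playerB, playerC, dz_start]
X = np.array([
    [1, 1, 0, 1],   # A and B on ice, DZ start
    [1, 0, 1, 0],   # A and C on ice, neutral-ish start
    [0, 1, 1, 1],   # B and C on ice, DZ start
    [1, 1, 1, 0],   # all three, no DZ start
], dtype=float)
y = np.array([3.0, 1.0, 2.5, 1.5])  # shots against on each shift

lam = 1.0  # ridge penalty: shrinks every estimate toward zero
beta = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
# beta -> roughly [0.56, 0.81, 0.31, 1.0]: each column's isolated
# impact on shots against, after accounting for linemates and zone.
```

Note that the DZ-start coefficient comes out positive (starting in your own zone costs you shots against), and player B absorbs more blame than player C because B appears more often on the bad shifts.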

1) “Impact” models quantify player impact implicitly, not explicitly

All of these models use “on-ice” metrics exclusively. You may think of a player’s “impact” as the value, in terms of goals, shots, etc., of what they do on the ice. But it’s important to recognize that these models do not measure what a player does; they only measure what happened. So they represent the value of a player’s “shifts” more than the value of a player’s “performance.” While counterintuitive, this is actually by design. We don’t care what a player does unless that thing helps produce an outcome we care about. So, rather than measuring the things they do, we measure the outcome. How they produce that outcome is not always clear.

So the take here, according to most public models, would be as follows: “Whatever Nico is doing on the ice, it seems to be producing worse defensive results than we’d have expected given the circumstances of his shifts”.

2) The models attribute “credit” based on what happens in shifts. Their guess about who deserves credit will improve over time as they learn.

Another reason people sometimes dislike these models is that they are, by definition, guesses. They are very good guesses, informed by a lot of supporting data, but they are still guesses. It’s not like counting goals or shots: every modeler would agree on those counts, yet they all have slightly different outputs for impacts. The reason is that they allocate “credit” differently. There are 5 skaters on each side of the ice, so there is no single measurement describing a given player’s impact. Luckily, in hockey, players log a lot of minutes with a lot of different teammates and opponents, so we can see what else they’ve done this year and determine who was likely responsible for the results.

If a player seems to continuously outperform expectations, then they will be seen as more likely to be the cause of a good shift than a player with less consistently positive results. As such, the models’ interpretation of what “happened” in a given game will actually change as new information is provided. It’s not like a goal or a shot, where one player objectively did it. Everyone could have impacted the outcome, and our best guess evolves with new data.

If a player was bad for 10 games and then has a great one, the models might give credit to his linemates for helping buoy him. But if his next 10 games also turn out to be good, it will go back and correct it — the logic being “maybe he was better than we thought at the time!”
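A minimal sketch of that correction dynamic, using an invented shrinkage estimate rather than any real model’s machinery: the estimate starts at the prior (zero, i.e. average) and drifts toward the observed results only as the sample grows.

```python
# Toy illustration of how a "guess" about a player updates as games
# arrive. The prior_weight constant is invented for illustration and
# is not taken from any published model.

def shrunk_estimate(game_results, prior=0.0, prior_weight=10.0):
    """Weighted blend of a prior belief and observed per-game impact."""
    n = len(game_results)
    observed = sum(game_results) / n if n else prior
    return (prior * prior_weight + observed * n) / (prior_weight + n)

bad_stretch = [-1.0] * 10             # ten poor games
one_good = bad_stretch + [2.0]        # then one great game
good_run = bad_stretch + [2.0] * 10   # ...followed by nine more

# The single good game barely moves the estimate; a sustained run
# pulls the guess back up, retroactively "correcting" the read.
print(shrunk_estimate(one_good))   # still clearly negative
print(shrunk_estimate(good_run))   # now positive
```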

The models evolve. And in doing so, they will improve. But they will still be best guesses at who deserves credit for the results that we observe.

3) All impact models are different

It’s a pet peeve of many public analytics writers and modelers when someone says “the analytics love _______” — particularly if that take is not unanimous. This is very common, and I’ve probably done it myself. But it’s worth pointing out that every model is built in a different way. These models are machines that spit out the “guess” I was talking about earlier after you feed something in. The only things they have in common are the output and the options for what to input. They can choose different inputs from what’s available, and they can feed them into different machines in different ways.

Some include “priors” which inform the “guess” based on what players have done in previous seasons and what they’re likely to do next. Some adjust to new information faster than others. Some assume rookies will be replacement-level, or league-average, or something else entirely.
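To see why those design choices matter, here is a hypothetical comparison of two rookie priors applied to the same observed results. The numbers and weighting scheme are invented; the point is only that different priors disagree early and converge as minutes pile up.

```python
# Same observed results, different priors: a sketch of why a model
# that assumes rookies are replacement-level disagrees (at first)
# with one that assumes they are league-average. All values invented.

def estimate(observed_per_game, n_games, prior, prior_weight=10.0):
    """Blend a prior with observed results, weighted by sample size."""
    return (prior * prior_weight + observed_per_game * n_games) / (
        prior_weight + n_games
    )

obs = 0.5  # per-game impact the rookie actually posts
for n in (5, 20, 80):
    avg_prior = estimate(obs, n, prior=0.0)    # "rookies are average"
    repl_prior = estimate(obs, n, prior=-0.4)  # "rookies are replacement"
    print(n, round(avg_prior, 2), round(repl_prior, 2))
# The gap between the two models shrinks as n grows.
```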

Understanding which models do what can help you isolate any differences between them. Sometimes.

An Example and How Models See It

The score is 2-1 Devils with 10 minutes left in the 3rd period of a home game against the Bruins. The Bruins played last night, and the Devils are on 2 days’ rest. Bruce Cassidy sends out his top line for an OZ faceoff against Nico Hischier, Pavel Zacha, and Jesper Bratt. Bergeron wins the faceoff, the Bruins possess the puck for about 35 seconds and eventually produce 2 low-danger shots, both of which Blackwood stops. Severson collects the second rebound and passes to Nico, who exits the DZ, enters the OZ, and does a lap around the net, allowing his linemates to head to the bench, before passing to Jack Hughes, who is just starting his shift. Nico then gets off the ice, completing the line change.

How do models view this in terms of analyzing Nico’s defense impact?

What the models see:

They see that Nico was put in a situation where his team was up 1 goal, in the 3rd period, in the defensive zone - all of which will make shots against very likely. They also see that he’s playing against one of the best lines in the league. Maybe, in situations like this, you’d expect 3 shots against.

That’s a tough situation, but it is a little less tough since he had more rest and was playing at home. Perhaps those benefits reduce the shot expectation by 1, making the cumulative expectation 2 shots against. The model now needs to attribute credit for those 2 shots against to the Devils on the ice. Maybe it knows Bratt’s season has gone well defensively and Zacha’s has gone poorly, so it sees them as cancelling out, leaving Nico with a net impact of 0. That will change if we later learn that Zacha or Bratt (or Nico) are consistently producing results different from the ones they were at this time. In some models, it also knows what those players have done in past seasons (Evolving-Hockey’s RAPM does not do this; it knows nothing about Zacha/Bratt until they play games that season).
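In code, the bookkeeping for this hypothetical shift might look like the following. Every adjustment size is the illustrative one from the example above, not a real model output.

```python
# Sketch of the expectation-and-residual bookkeeping for the Bruins
# shift described above. All adjustment sizes are the illustrative
# ones from the text, not outputs of any real model.

baseline = 3.0         # expected shots against: DZ start, up a goal,
                       # 3rd period, elite opposing line
rest_and_home = -1.0   # rest advantage plus home ice

expected_against = baseline + rest_and_home  # 2.0 shots expected
actual_against = 2.0                         # what actually happened

residual = actual_against - expected_against  # 0.0: as expected

# Credit split: Bratt's good season and Zacha's poor one are treated
# as offsetting (invented values), so Nico's share is the remainder.
bratt_adj, zacha_adj = -0.5, +0.5
nico_share = residual - (bratt_adj + zacha_adj)
print(nico_share)  # -> 0.0, a net-neutral shift for Nico
```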

What the models don’t see:

The model works on the assumption that all players react to circumstances in roughly the same way.

It doesn’t know if the Boston line is better than league-average in the offensive zone. It, of course, knows that they are great overall, and that the offensive zone makes all lines better, but it doesn’t know if this line is “more better” than expected. Maybe the average line goes from 0 to +1 in the OZ, but these guys, due to their masterful OZ chemistry, go from +1 to +3. Similarly, it doesn’t know if Zacha and Bratt are worse than expected in the DZ. An impact model assumes that a defensive zone faceoff is equally harmful to every player. The same goes for score, venue, rest, etc.
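That additivity assumption can be sketched directly. The model below scores a shift as line effect plus zone effect with no interaction term; the numbers are invented to match the +1-to-+3 example above.

```python
# Sketch of the additivity assumption: the model predicts
# line effect + zone effect, with no line-by-zone interaction.
# Values are invented to mirror the example in the text.

effects = {"elite_line": 1.0, "oz_start": 1.0}  # additive terms

def model_prediction(line_effect, zone_effect):
    # No interaction term: every line gets the same zone bump.
    return line_effect + zone_effect

predicted = model_prediction(effects["elite_line"], effects["oz_start"])
true_oz_value = 3.0  # this line's actual value with its OZ chemistry

chemistry_gap = true_oz_value - predicted
print(chemistry_gap)  # -> 1.0, the effect the model cannot see
```

Whatever that unseen gap produces on the ice gets misattributed to the other terms the model does track, which is exactly how blame can land on the wrong player.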

Another thing these models don’t do is use the individual performance of a player to inform their “guess” about who deserves credit. It doesn’t know Severson got the rebound, or Nico exited the zone, or that Bergeron won the faceoff. It just assumes that, if those players do those things a lot, eventually the model will discover the magnitude of the impact those types of plays have.

The model also doesn’t know that the shift ended in a more advantageous position than it began. It measures “success” only in terms of the offensive and defensive count of a given metric (typically shots, xGs, or goals). So, the fact that Nico personally carried the puck through the neutral zone and queued up the Hughes line for a productive offensive shift holds no value outside the fact that it prevented the opponent from scoring while we had the puck — any shots the Hughes line produces are credited to them, not Nico.

Lastly, the models don’t know a lot of specifics about the players. They don’t know a player’s height, weight, or draft position; whether they are injured, sick, having a baby, or getting divorced; and they often don’t even know the specific position a player plays (most treat all 5 skaters similarly or identically). They won’t know Nico was injured twice and on the COVID list once, that his team is nowhere near playoff contention, that one of his frequent linemates (Zacha) was playing a new position, or that another frequent linemate (Johnsson) was experiencing symptoms of COVID through the season due to his asthma.

So, now that we know how a model views things, what do we think it’s actually doing when it determines Nico’s impact?

How to Question Nico Hischier’s Impact Outputs

First of all, here are the outputs from 3 common impact models in terms of Nico’s impact on expected goals against (higher = worse).

TDH is a 3-year model, HV is a 1-year model with priors, EH is an isolated 1-year model

All of the models agree that Nico Hischier is a negative impact defensively. But the two that use previous seasons (TDH and HV) believe he’s more of a negative impact. So, if you are of the opinion that Nico Hischier has recently substantially improved his defensive game, then you could say that those outputs are penalizing him for his past.

Conversely, the single-season model may not sufficiently describe how difficult Nico’s job was. His most common forward linemates all have historically worse impacts than they appeared to have this season.

This might indicate a misallocation of credit to these players instead of Nico.

Another reason that credit may be misallocated is the false assumption that all players react to all circumstances identically. After Nico’s return, Jesper Bratt’s offensive zone start percentage went from 61.5% to 48.0% and Zacha’s went from 44.0% to 39.4%. It is reasonable to say that those players see larger-than-normal decreases in production when they start in the defensive zone and are disproportionately effective in the offensive zone. Therefore, when they are used more in the DZ with Nico, they will get results “worse than expected” and Nico will catch the analytical blame due to the inflexibility of the zone impacts.

And the final critique is a simple lack of good data. This year — the only season under the new staff — Nico Hischier played less than 50% of an already-shortened season, all of which came during the super-condensed post-COVID portion of the year. Since these models don’t allocate credit for things a player individually does, you may be seeing a defensive improvement in Nico earlier than the models can detect it.

These are all possible confounding factors in Nico’s defensive impact metrics. Now, what is the overall conclusion after having accounted for these things?

Is Nico Hischier Bad Defensively?

To be honest, I don’t see all of these possible confounding factors as making enough of a cumulative impact to drag Nico anywhere close to the “future Selke” status he’s been given by other analysts.

I do think that there is probably some misallocation of blame/credit surrounding him and his linemates given the complications of the weird/short season, the unusual usage of his linemates when paired with him, and the atypically positive defensive impacts they appeared to have.

It’s impossible to know how much, though. And it’s difficult to say we should account for the poor defensive history of his linemates while ignoring his own poor defensive history. The needle in need of threading here is “Nico became good defensively very recently, but his linemates were always poor defensively, therefore only their priors are relevant.” That’s an ambitiously specific claim.

In terms of his defensive zone play, in particular, it seems unlikely to me that he’s significantly better than these impacts portray. Of the 15 Devils forwards that had at least 20 minutes of defensive zone shift time, Nico is last (by far) in scoring chance ratio according to PuckIQ (they call it “Dangerous Fenwick”).

I believe Nico Hischier is likely an average defensive center at this point in his career — perhaps slightly better than the impact models show, but still nowhere close to Selke-level defensive value. All of this can become obsolete if he can translate his raw defensive skill into actual on-ice defensive results next season. And I hope he does. In the meantime, allow this article to serve as a blueprint on how to criticize the models. Whether or not you should do that will be up to you to decide.

Thanks for reading and leave your thoughts in the comments below!