I started this project to better understand which players were generating offense, but after hundreds of games and mountains of data, it also became apparent that tracking passing data revealed a bit more about how goals are scored. This makes perfect sense from a logical standpoint: simply by passing the puck, a goaltender has to move, set, and assess the situation all in a matter of seconds, sometimes even less than that. It quickly became apparent to me that making the goalie work harder, process more information about what’s happening in front of him, was the key to elevating the level of danger in a shot attempt.
Now, concurrent with my project, the magnificent site, War-on-Ice, revealed how they categorized shot attempts by danger zone. Obviously, the closer you are to goal, the more dangerous it is. No one would dispute that, but what occurs prior to the final shot location is vital to determining the added level of danger. Add up all of these factors and you have what people commonly call "shot quality."
Everyone believes shot quality exists: a shot closer to the goal is more likely to go in, a goalie being screened is more likely to let in a shot, a shot that changes direction is more difficult to stop, and so on. The trouble arises when one tries to take all of these factors and produce a predictive and reliable metric to account for shot quality. War-on-Ice has done this to an extent in its scoring chance data. In fact, they demonstrated that a player’s prior scoring chance percentage was a better predictor of future goal percentage than Corsi, which makes it a better way to account for location and level of danger.
Expected Goals models are another way to account for shot quality. Weighting shot type, location, distance, shot sequence, etc. on the probability of each event resulting in a goal is nothing new to hockey or sports analytics. Alan Ryder was doing it ten years ago and we're all just doing different variations of his work. However, if we can find improvements in shot-based metrics that are more widely used, I think that has greater appeal. War-on-Ice’s scoring chance numbers are a good example: a simple metric that offers great value. Often the simplest models or metrics can offer the biggest value if only because everyone knows how they’re constructed.
I feel there’s still a lot of work to be done by my group in order to better answer these questions, but this article reveals some of the "work-in-progress" going on behind the scenes. I present it to get feedback and suggestions. Some of the more intriguing papers and presentations I’ve read or seen were about finding different ways to analyze the game. It’s okay to have a presentation based around "no," or a paper that wasn’t as paramount as the author had originally hoped but still offers new ways of thinking about how to analyze performance. An example of the former would be Micah Blake McCurdy’s presentation from DC. An example of the latter would be Sam Ventura’s paper on Zone Transition Times, which he presented last November.
With new data, I feel you want to illustrate how to either explain something that’s happening on the ice or predict goal-scoring. At the end of the day, we want to find out how to score goals or understand how specific phases of the game impact shot totals (Zone Entries would be a good example of this). So, here’s the Passing Project’s first attempt at doing just that. All non-passing data was pulled from War-on-Ice.
What I’ve done is split the data we have for the forwards and defensemen on the six teams we tracked over the course of the full season (Chicago Blackhawks, Florida Panthers, New Jersey Devils, New York Islanders, New York Rangers, and Washington Capitals). I’ve only included players that had at least 300 minutes of ice time in both halves of the season. This gives me fifty-three forwards and thirty-two defensemen. This post will discuss the forwards as I’m still slogging through numbers for the defense. It’s not a large sample size in terms of players, but this is the data we have from last season, so we’re going to take a look and see where there is signal and what to revisit as we add data for the coming season. Below is a glossary if you’re unfamiliar with terms.
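The 300-minute filter described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the actual workflow behind this post: the function name, the (player, half, minutes) row layout, and the sample rows are all hypothetical.

```python
MIN_TOI = 300.0  # minutes required in EACH half of the season, per the post


def qualifying_players(rows, min_toi=MIN_TOI):
    """Return sorted names of players with at least min_toi minutes in both halves.

    rows is an iterable of (player, half, toi_minutes) tuples, where half is 1 or 2.
    This row layout is an assumption made for the example.
    """
    totals = {}
    for player, half, toi in rows:
        key = (player, half)
        totals[key] = totals.get(key, 0.0) + toi

    players = {player for player, _ in totals}
    return sorted(
        p for p in players
        if totals.get((p, 1), 0.0) >= min_toi and totals.get((p, 2), 0.0) >= min_toi
    )


# Made-up ice time, purely to exercise the filter.
sample_rows = [
    ("Forward A", 1, 310.0), ("Forward A", 2, 305.0),
    ("Forward B", 1, 400.0), ("Forward B", 2, 120.0),
    ("Forward C", 1, 300.0), ("Forward C", 2, 300.0),
]
print(qualifying_players(sample_rows))
```

A player with a big first half and a short second half (Forward B here) is dropped, which is why the sample shrinks to fifty-three forwards.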
CC/60, or Corsi Contributions per sixty minutes: This is a player’s total offensive contributions in the form of individual shot attempts, primary passes leading to shot attempts, and secondary passes leading to shot attempts.
PCC/60, or Primary Corsi Contributions per sixty minutes: The same thing only without secondary passes.
GF/60, or Goals For per sixty minutes: The rate at which the team scores goals with that player on the ice.
Ppoints/60, or Primary Points per sixty minutes: A player’s goals and primary assists.
D/NZ SAG/60, or Primary Passes made in transition that lead to shot attempts per sixty minutes: This tracks all shot attempts that were generated by a pass made in either the defensive or neutral zone - "in transition."
Composite SAG/60, or Total Passes leading to shot attempts per sixty minutes: This is CC/60 without the player’s individual shot attempts, a total passing contribution rate.
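To make the relationship between the three passing rates concrete, here is a minimal sketch that follows the glossary definitions. The function names, argument names, and the counts in the usage example are hypothetical; only the arithmetic comes from the definitions above.

```python
def per60(count, toi_minutes):
    """Convert a raw count into a per-sixty-minutes rate."""
    if toi_minutes <= 0:
        raise ValueError("toi_minutes must be positive")
    return count * 60.0 / toi_minutes


def passing_rates(shot_attempts, primary_passes, secondary_passes, toi_minutes):
    """Compute the glossary rates for one player.

    shot_attempts:    the player's own shot attempts
    primary_passes:   primary passes that led to a shot attempt
    secondary_passes: secondary passes that led to a shot attempt
    """
    return {
        # Own attempts plus both kinds of passes.
        "CC/60": per60(shot_attempts + primary_passes + secondary_passes, toi_minutes),
        # Same thing without secondary passes.
        "PCC/60": per60(shot_attempts + primary_passes, toi_minutes),
        # CC/60 without the player's own attempts.
        "Composite SAG/60": per60(primary_passes + secondary_passes, toi_minutes),
    }


# Made-up counts: 100 attempts, 60 primary and 40 secondary passes in 600 minutes.
rates = passing_rates(100, 60, 40, 600)
print(rates)
```

With these made-up counts, CC/60 is 20.0, PCC/60 is 16.0, and Composite SAG/60 is 10.0, so the gap between CC/60 and PCC/60 is exactly the secondary-pass contribution.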
So, starting with the forwards, we see that the rate at which a player contributes to shot attempts, through his own attempts and those he set up with primary or secondary passes, predicts the rate at which goals are scored with him on the ice better than anything else. Moreover, this is a highly repeatable metric. Towards the right of the graph we see how well Corsi For and Scoring Chances For predict Goals For within this sample of players. Now, sample size is likely playing into some of these numbers, so here's how the shot metrics predict goals using the same parameters (300 minutes in each half of the season), but now including all forwards from last season who met those criteria (272).
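For anyone who wants to try this kind of comparison on their own data, here is a rough sketch of a split-half test. It assumes the predictive value is measured as the r-squared between a first-half metric and second-half GF/60, and repeatability as the r-squared between the two halves of the same metric. That is my framing for the example, and the function names and numbers are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return sxy / sqrt(sxx * syy)


def split_half_r2(first_half_values, second_half_values):
    """r-squared between a first-half metric and a second-half target.

    Both lists must be ordered by the same players.
    """
    return pearson_r(first_half_values, second_half_values) ** 2


# Made-up numbers for three players, in the same player order.
first_half_cc60 = [1.0, 2.0, 3.0]
second_half_gf60 = [1.0, 3.0, 2.0]
print(split_half_r2(first_half_cc60, second_half_gf60))
```

Passing the same metric for both halves (first-half CC/60 against second-half CC/60) gives the repeatability number instead of the predictive one.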
When we increase the number of players, we see Scoring Chances and Corsi go to the front of the class. Would passing metrics see a similar boost in their correlations if we had a larger sample size? We'll know this season!
Going back to the first chart, we see a 5% increase from just primary shot attempt contributions to total contributions. Is the inclusion of secondary passes really that important? Well, you may have seen this chart in some of my other pieces recently, but I will post it again as I think it helps explain why this is the case.
Goals are scored at a higher rate from sequences with multiple passes than from those with only a single pass or no pass at all. So, my guess is that this is why total contributions are more important than just primary ones.
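The breakdown by passing sequence is simple to reproduce from tracked data. The sketch below assumes each shot attempt is recorded with the number of passes that preceded it and whether it became a goal. That record layout, the bucket labels, and the sample attempts are hypothetical and are not the project's real numbers.

```python
def sequence_bucket(passes_before_shot):
    """Label a shot attempt by how many passes preceded it."""
    if passes_before_shot <= 0:
        return "no pass"
    if passes_before_shot == 1:
        return "single pass"
    return "multiple passes"


def scoring_rate_by_sequence(attempts):
    """Return goals per attempt for each sequence bucket.

    attempts is an iterable of (passes_before_shot, is_goal) pairs.
    """
    counts = {}  # bucket -> (attempts, goals)
    for passes, is_goal in attempts:
        key = sequence_bucket(passes)
        shots, goals = counts.get(key, (0, 0))
        counts[key] = (shots + 1, goals + (1 if is_goal else 0))
    return {key: goals / shots for key, (shots, goals) in counts.items()}


# Made-up attempts chosen only to show the calculation.
sample_attempts = (
    [(0, False)] * 9 + [(0, True)]      # 1 goal on 10 unassisted attempts
    + [(1, False)] * 4 + [(1, True)]    # 1 goal on 5 single-pass attempts
    + [(2, True), (3, False)]           # 1 goal on 2 multi-pass attempts
)
print(scoring_rate_by_sequence(sample_attempts))
```

Whether the denominator should be all shot attempts or only shots on goal is a choice the sketch leaves to whoever adapts it; here it is every attempt in the input.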
Now, let’s look at predicting shooting percentage, or trying to answer, "How well can Teammates Raise On-Ice Shooting Percentage?" David Johnson of puckalytics and stats.hockeyanalysis wrote up a post on this not too long ago. Let’s see how the passing metrics perform.
Well, the range is a bit closer among the top-five metrics (TOI% represents the percentage of ice time a player receives), but we do see a player’s total passing rate slightly ahead of on-ice shooting percentage. The big difference is in how measurable and repeatable the two metrics are. This also follows logically from the previous chart on shooting percentage by sequence. Shot metrics fall off a bit here, and only do marginally better when we look at the first and second halves of last season using all forwards (272). The r-squared values are all around the same (0.05).
So, while there isn’t enough data to definitively say, "yes, we should use these metrics to better predict goals and shooting percentage," I believe there is enough of a signal and logical sense that we can at least say, "maybe." This is definitely going to be revisited as we accumulate more and more data, but I feel it’s good to often stop and look for signs of improvement on what we use already. If we’re not improving on that, then there’s no need to do this, but I think we are, bit-by-bit.
This post is not to be confused with trying to replace Corsi or other shot metrics, so don’t even start that up. Corsi is a wonderful metric in that it is powerful and simple. What this project is about is filling in the gaps between player shot attempts and on-ice shot attempts and quantifying both that and how goals can be scored. If anything, this project is about enhancing Corsi.
What questions do you have? What would you like to see answered in a future post? I’ll dig into the defensemen as soon as I have time and get that to all of you as well. Thanks for reading.