Extraskater recently added a new sortable stat: Setup Passes. Setup passes are defined as an "estimate of passes that result in a shot attempt." Essentially, it measures the same thing I've been tracking this past season and what you all know as the SAG stat: Shot Attempts Generated, the pass preceding a shot attempt.
I decided to take the time to compare Extraskater's estimate with the real thing and see how it stacks up. The estimate originates with Rob Vollman and his work at HockeyProspectus. Vollman even discusses estimating passes in last year's Hockey Abstract. (As an aside, you can purchase Hockey Abstract on Amazon here. I recommend you do if you haven't read it.) I'll quote from these sections for those who are unfamiliar with Vollman's thoroughly researched and well-written Hockey Abstract. In it, Vollman explains why he chose assists and shooting percentage as the basis of his pass stat.
"While the NHL doesn’t currently record passes at all, let alone those that resulted in actual shots, the best way to estimate passes is by taking a player’s primary assists and dividing it by the team’s shooting percentage while that player was on the ice" (Vollman, 211). He ignores secondary assists because, like my reasoning for the SAG stat, "those didn’t directly result in the shot that resulted in the goal," which is right.
Vollman continues on the same page to argue "that the percentage of goals on which the player received a primary assist should roughly match the percentage of all shots for which he made the pass…Obviously, without actual counted data, there’s no way of knowing how accurate these estimates are."
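Vollman's formula as quoted above can be written out in a few lines. This is only a sketch: the function name and the sample inputs (20 primary assists, 8% and 10% on-ice shooting) are made up for illustration and are not taken from Hockey Abstract, and I can't confirm that Extraskater computes its number exactly this way.

```python
def estimate_setup_passes(primary_assists, on_ice_shooting_pct):
    """Vollman-style estimate: primary assists / on-ice shooting percentage.

    on_ice_shooting_pct is a fraction (0.08 means 8%).
    The function name is hypothetical, chosen for this sketch.
    """
    if on_ice_shooting_pct <= 0:
        raise ValueError("on-ice shooting percentage must be positive")
    return primary_assists / on_ice_shooting_pct


# Hypothetical player: the same 20 primary assists at two
# different on-ice shooting percentages (illustrative numbers only).
low_pct = estimate_setup_passes(20, 0.08)
high_pct = estimate_setup_passes(20, 0.10)
print(round(low_pct), round(high_pct))
```

Because shooting percentage is the divisor, the same assist total produces a larger pass estimate when the on-ice shooting percentage is lower.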
Vollman then provides his findings: a list of the top 100 forwards based on his stat from 2007 - 2013. Many of the names at the top of the list are obvious (Sidney Crosby, Joe Thornton, Henrik Sedin, Evgeni Malkin, etc.) as those players tend to have some of the highest assist totals in the game. One name Vollman did not expect was Scott Gomez, who was between Thornton and Sedin in the number three spot. Vollman writes regarding this surprise: "Does the inclusion of Scott Gomez so high on this list mean that the formula is broken or has it discovered something new?....his inclusion does reveal a slight flaw in our method of estimating passes. His own 5.7% shooting percentage during this period is so incredibly low that it significantly brought down the team average. Being the divisor, this actually inflates the estimate of how many passes it required to reach his assist totals, meaning that he probably didn’t pass it nearly as often as we’ve estimated" (214 – 15).
Gomez’s inclusion highlights the risks of using shooting percentage in constructing any stat. Vollman, wisely, prefaces his findings with this on shooting percentage: "Shooting percentage is largely luck-driven, the extent of which depends on your sample size" (210). He cites Alexander Ovechkin’s jump from an 8.7% shooting percentage in 2010 – 2011 to a 14.5% shooting percentage in the 2012 – 2013 lockout year as an example of this. Because his findings cover six seasons, the larger sample mitigates the risk somewhat, but, as he pointed out with Gomez, it still exists.
Vollman also uses assists in his construction of this pass estimate stat. He admits further the risk inherent in this stat when he writes: "Relative to the pass, there is so much luck involved in an assist, that it becomes almost meaningless. As little influence as the shooter has on whether his shot gets blocked, hits a post, sails just wide, or is snatched by a hot goalie, the passer has even less influence—basically zero…Assists? No—the most we can expect of someone is to simply make the pass" (210).
So, this stat depends entirely on two figures: one that is largely driven by luck (shooting percentage) and one that the passer has little influence over (assists). It’s unsurprising that once I tracked the Devils' 2013-2014 passes and shot attempts generated, the problems of this estimated stat were laid bare.
Comparing Setup Passes to Shot Attempts Generated
Below you’ll see each player's SAG (Shot Attempts Generated) totals that I tracked this past season compared to their Setup Passes total that is now on Extraskater. Since I tracked events only in even strength situations, I used only the even strength setup pass figures for this analysis. You can view the Devils setup passes here. Scroll down to the production stat report and it is under "SP."
As you can quickly see, setup passes is a terrible stat. If you were to believe it, you would think that Bryce Salvador (67.9) generated more shot attempts this past season than Andy Greene (67.2). That is obviously ridiculous, considering that we know Greene generated 108 shot attempts and Salvador only 36, and it becomes patently absurd once you remember that Salvador played in half the games that Greene did. Did he really produce on a game-by-game basis at such a high level to double Greene’s output? No.
Sticking with defensemen, the estimated stat is correct in showing Zidlicky as far and away the best distributor of the puck, but it estimated he set up 228.7 shot attempts, forty more than he actually did. Jon Merrill is the next highest with 105.8 (he really had sixty), and Anton Volchenkov is the third-most effective distributor of the puck on the Devils with 105 setup passes. I’ll let that sink in. Volchenkov only generated thirty-three shot attempts, not 105.
Overall, this stat estimated the defensemen generated 201 more shot attempts than they actually did.
Moving to the forwards, there are several discrepancies. By my tracking, Mattias Tedenby, Mike Sislo, and Jacob Josefson generated fifteen, fifteen, and thirty-one shot attempts respectively. The estimated stat shows zeroes for all three players. That is another problem with using assists and shooting percentage to formulate your stats: a player with no primary assists is credited with no passes at all.
Some players were close: Ryan Carter generated seventy-nine shot attempts and was estimated to have set up sixty-seven; Tuomo Ruutu generated fifty-two shot attempts and was estimated to have set up sixty-four. Travis Zajac generated 253 shot attempts and was estimated to have set up 268.5. Those were about it.
Stephen Gionta generated only fifty-four shot attempts, but was estimated to have set up 114.3. Damien Brunner generated seventy-one shot attempts, but was estimated to have set up 166. Besides the players that set up "0" shot attempts, my favorite statistical absurdity might be Tim Sestito’s estimated sixty-five setup passes, which would put him just behind Andy Greene and ahead of Ruutu.
Overall, this stat estimated the forwards generated 315 more shot attempts than they actually did.
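For anyone who wants to check the arithmetic, here is a small sketch that lines up the tracked and estimated figures quoted in the paragraphs above and computes the gap for each player. It covers only the players named in this post, not the full roster, so its sums will not match the team-wide totals of 201 and 315. Zidlicky's tracked total is taken as 188, and the variable names are my own.

```python
# (tracked SAG, Extraskater setup passes) for the players quoted in this post.
# Even strength only; this is a partial list, not the whole roster.
players = {
    "Andy Greene": (108, 67.2),
    "Bryce Salvador": (36, 67.9),
    "Marek Zidlicky": (188, 228.7),
    "Jon Merrill": (60, 105.8),
    "Anton Volchenkov": (33, 105.0),
    "Mattias Tedenby": (15, 0.0),
    "Mike Sislo": (15, 0.0),
    "Jacob Josefson": (31, 0.0),
    "Ryan Carter": (79, 67.0),
    "Tuomo Ruutu": (52, 64.0),
    "Travis Zajac": (253, 268.5),
    "Stephen Gionta": (54, 114.3),
    "Damien Brunner": (71, 166.0),
}

# Positive error: the estimate credits more setups than were tracked.
errors = {name: est - tracked for name, (tracked, est) in players.items()}

# Print the biggest misses first, in either direction.
for name, err in sorted(errors.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(f"{name:<18} {err:+7.1f}")
```

Sorting by the size of the miss makes it easy to see which players the estimate handles worst.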
A Few Words on PSR
PSR is a pass-shot ratio that, unfortunately, relies upon the setup passes stat to be of any worth. Since we now know that setup passes is not something we should use, PSR, by extension, also holds little worth, if any.
How Can We Correctly Estimate Passes?
Ideally, each team's passing and shot-generation figures would be tracked in similar fashion to what I've done this past season for the Devils. I thought about this and wondered if there was a way to estimate based on my findings, since that would be a much more convenient way of doing it. I'm not sure there's an accurate way to, but we can try to use the Devils to establish a baseline.
According to Extraskater, the Devils took 3342 shot attempts at even strength. Through my tracking, 2646 of those were the direct result of a pass, or 79.2% of their Corsi total. Taking 79.2% of a team's Corsi total would give you a simple way of estimating how many setup passes the team made. Sure, every team would be slightly different, but if you're looking for a quick way to estimate, it's not going to be accurate regardless.
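The team-level baseline is just one division and one multiplication. A minimal sketch, using the two Devils totals quoted above; the function name is mine, and applying the Devils' share to any other team is an assumption, not something I've verified.

```python
DEVILS_EV_ATTEMPTS = 3342     # even strength shot attempts, per Extraskater
DEVILS_PASS_GENERATED = 2646  # attempts that followed a pass, from my tracking

# Share of the Devils' even strength attempts that came directly off a pass.
pass_share = DEVILS_PASS_GENERATED / DEVILS_EV_ATTEMPTS


def estimate_team_setup_passes(team_corsi_for, share=pass_share):
    """Rough team estimate: assume the Devils' pass share holds for this team."""
    return team_corsi_for * share


print(f"{pass_share:.1%}")
```

Feeding the Devils' own 3342 back in returns 2646 by construction, which is why the team-level figure looks so good for New Jersey and tells you nothing about anyone else.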
For players, let's use Marek Zidlicky as an example here. While Zidlicky was on the ice, the team attempted 1197 shots. If you subtract his own shot attempts (168), you now have 1029 shot attempts. If you assume, for a baseline, that 79.2% of those 1029 shot attempts were generated via a pass, you now have 815 shot attempts. Divide that by the four other skaters on the ice (Zidlicky cannot set up his own shot attempts) and you arrive at 204, which is still not the accurate 188, but it's much closer than the estimation used in the setup passes stat.
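The Zidlicky arithmetic above, written as a function so it can be rerun for other skaters. Treat it as a sketch of my back-of-the-envelope method, not a validated model: the function and parameter names are made up here, and the 0.792 default is the Devils-only share from the previous paragraph.

```python
def estimate_player_sag(on_ice_attempts, own_attempts,
                        pass_share=0.792, other_skaters=4):
    """Back-of-the-envelope SAG estimate for one skater.

    1. Remove the player's own attempts (he cannot set those up).
    2. Assume pass_share of the remainder came off a pass.
    3. Divide by other_skaters, as in the worked example above.
    """
    remaining = on_ice_attempts - own_attempts
    return remaining * pass_share / other_skaters


# Marek Zidlicky at even strength: 1197 on-ice attempts, 168 of them his own.
zidlicky_estimate = estimate_player_sag(1197, 168)
print(round(zidlicky_estimate))  # 204, versus 188 actually tracked
```

Dividing by four treats every skater as an equal distributor, which is a big part of why this approach struggles at the player level.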
I tried this with the rest of the Devils skaters and the results were just as mixed as with the setup passes stat. Below are the results, along with how far both the setup passes estimate and this "79.2%" estimate were from the actual tracked data.
When I applied this idea to the entire team, the results were mixed. In terms of the average difference between a player's SAG totals and their 79.2% figure, it did a worse job of estimating defensemen contribution, but a better job of estimating the forward contribution. On a team basis it was nearly exact (obvious considering where I took the original 79.2%), but the setup pass estimate said the team set up 516 more shot attempts than it did, a giant error. Overall, they both stunk.
My reason for doing this is to make one point: do not estimate passes or passes that result in shot attempts. If I can play with some of the numbers I tracked this season and come up with a simple estimating formula that does as good a job as setup passes, then estimating can't be done accurately enough to be given any kind of validity.
Now, Vollman does a tremendous amount of valuable work for the hockey analytics community, and if you don’t already own Hockey Abstract you should buy it because it contains a wealth of information and insight. His creation of an estimated passing stat was the best the hockey analytics world had last year, and he himself expressed the same concerns I have here regarding how much of its two components are luck-driven. I write this simply to provide a comparison between real, hand-tracked stats with definitions of each event (pass attempts and SAG) and the setup pass formula of assists and shooting percentage. Comparing stats to hard data is the only way to determine whether or not there is any validity to them.
Over time, hopefully more people track passes and shot-generation in the way that I do. Only then will we have real data to work off of and not have to settle for estimates based on unreliable figures.
What do you all think of this "setup passes" statistic? Am I missing something in comparing it to the work I've done this year? Maybe you all have some better ideas on how to use my raw data to attempt to estimate production for other teams? As usual, sound off below and I'll respond.