I was listening to a TSN Radio show for some offseason hockey analysis when my interest was piqued by a good hour-long conversation about which teams were "strong up the middle." The phrase comes from a couple of sports, but it is most commonly used in baseball to describe a team with strong defensive players at catcher, shortstop, second base, and center field. I had heard it used in hockey before but never analyzed it to any degree. They discussed how the 3 most successful recent teams (Chicago, Boston, and LA) are all perfect examples of being really strong up the middle of their rosters, meaning specifically a stud top-line center, a #1 defender, and a starting goalie. So in this article I've decided to take a brief look into the cliche's validity, and then briefly investigate the implications for our team.
My analysis of player quality is going to be done using Tom Awad's GVT. For those unfamiliar with this statistic, it is very similar to the concept of WAR (wins above replacement) used in baseball, but it is measured in goals rather than wins: GVT stands for goals versus threshold. It measures the number of goals a player adds or prevents relative to a "threshold" player. If a player was on more than one team, his stats may be a little screwy. If you do not want to bother with the statistics lesson on the horizon, now is when you should feel free to skip down to the chart showing the best GVTs at each position for each team and play with that before reading my conclusions.
The main statistical tool used in this investigation is called a "best subset analysis." If you are familiar with this tool, you may skip down a couple of paragraphs, but for those unfamiliar I will explain it to the best of my ability here.
In many articles on this blog and elsewhere, you will see a graph with a bunch of points and a line of best fit. In good articles you will also see something called an "r-squared" value. This tells you how well the variables are correlated: it is the share of the variation in one variable that the line explains. Linear regression is the statistical tool that produces that line, and it goes a step further by estimating how much a specific variable moves with the other. In short, it measures how well one variable predicts another, though on its own it still cannot prove causation.
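For anyone who wants to see the r-squared idea in action, here is a minimal sketch in plain Python. The numbers are invented for illustration (they are not real GVTs or point totals), and the function name `r_squared` is my own, not something from Minitab.

```python
def r_squared(xs, ys):
    """R-squared of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variation left over after the line vs. total variation in y.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up data: top-center GVT and season points for five imaginary teams.
center_gvt = [5.0, 10.0, 15.0, 20.0, 25.0]
points = [78.0, 88.0, 85.0, 101.0, 104.0]
print(round(r_squared(center_gvt, points), 3))
```

A value near 1 means the line explains almost all of the variation in points, and a value near 0 means it explains almost none.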
A best subset analysis runs a bunch of regressions, one for each combination of the candidate variables, and picks out the most predictive combinations. This process is done automatically using software like Minitab. In the next section I explain how I used best subsets to determine the most important positions in hockey.
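To make the idea concrete, here is a rough sketch of what a best subset search does under the hood. This is not Minitab's implementation, just a brute-force version in plain Python: fit a regression for every combination of predictors and keep the one with the highest r-squared at each size. The data is invented, and I deliberately built the fake point totals from Center and MaxWing only so the result is easy to check.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a * x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: move the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_r_squared(columns, y):
    """R-squared of an ordinary least-squares fit of y on the given columns."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
                 for r, yi in zip(rows, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def best_subsets(data, y):
    """For each subset size, return (r_squared, names) of the best-fitting subset."""
    names = sorted(data)
    best = {}
    for size in range(1, len(names) + 1):
        scored = [(fit_r_squared([data[n] for n in combo], y), combo)
                  for combo in combinations(names, size)]
        best[size] = max(scored)
    return best


# Invented GVTs for eight imaginary teams (not real NHL data).
data = {
    "Center": [12.0, 8.0, 15.0, 5.0, 10.0, 18.0, 7.0, 11.0],
    "MaxWing": [9.0, 14.0, 6.0, 11.0, 8.0, 12.0, 5.0, 10.0],
    "Goalie": [10.0, 20.0, 5.0, 15.0, 25.0, 8.0, 12.0, 18.0],
}
# Fake points built as 60 + 2 * Center + MaxWing, so Goalie adds nothing here.
points = [93.0, 90.0, 96.0, 81.0, 88.0, 108.0, 79.0, 92.0]

for size, (r2, combo) in best_subsets(data, points).items():
    print(size, combo, round(r2, 3))
```

With 7 candidate positions the brute-force approach only has to try 127 combinations, which is why software can do this instantly.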
Best Subset Analysis Application
The variable I chose as the result is team points, because I thought that would be the best measure of team quality. The possible predictors changed depending on the analysis I ran, because the predictors can't overlap (Max Wing, for example, is always the same player as either the Left Wing or the Right Wing). Over the course of the entire investigation the following variables were used: Center, Right Wing, Left Wing, Max Wing, Defender, Defender 2, Goalie. The 2 new positions here are Max Wing, which merely represents the best winger on a team, and Defender 2, which represents the 2nd best defender on a team. One thing to note is that the Devils' top center was Henrique last year according to GVT.
I ran a total of 6 best subset analyses with different combinations of those positions. The following section explains which positions came out as most predictive, meaning the positions whose performance best predicts the team's ultimate point total.
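If you want to build these variables yourself, the sketch below shows one way to pull them out of a roster. The roster, the player labels, and the GVT numbers are all hypothetical, and the function name `top_position_gvts` is mine.

```python
def top_position_gvts(roster):
    """Pick the best GVT at each position from (name, position, gvt) tuples."""
    def ranked(position):
        # GVTs at one position, best first.
        return sorted((gvt for _, pos, gvt in roster if pos == position),
                      reverse=True)

    left, right, defense = ranked("LW"), ranked("RW"), ranked("D")
    return {
        "Center": ranked("C")[0],
        "Left Wing": left[0],
        "Right Wing": right[0],
        "Max Wing": max(left[0], right[0]),  # best winger on either side
        "Defender": defense[0],
        "Defender 2": defense[1],            # 2nd best defender
        "Goalie": ranked("G")[0],
    }


# Hypothetical roster with invented GVTs.
roster = [
    ("Player A", "C", 11.2), ("Player B", "C", 9.8),
    ("Player C", "LW", 8.1), ("Player D", "RW", 15.4),
    ("Player E", "D", 10.3), ("Player F", "D", 9.9), ("Player G", "D", 4.0),
    ("Player H", "G", 12.6),
]
print(top_position_gvts(roster))
```

Notice that Max Wing always duplicates one of the two wing values, which is why it can't go into the same analysis as Left Wing and Right Wing.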
Best Subset Results
I'd like to preface this by saying that no single result had an R-squared over 95%, which means none of them is definitively the be-all and end-all answer to which position is most important. Furthermore, these datasets are made of only 30 observations per variable (one player at each position per team in the 2013-2014 season), so even a convincing result would need to be used tentatively. This is an investigation into which positions were most important to this year's NHL teams' success.
There was one definitive result of this investigation. In every analysis I ran, the first result was Center. This was the runaway winner for most predictive position. The first 2 variables to show up as the most important were Center and some version of a winger in all analyses. When it was an option, Max Winger was the most helpful; when it wasn't, Left Wing took its place. This is already contrary to the "strong up the middle" mentality. It is worth noting, however, that the Goalie was basically tied with the winger. Though these were the strongest variables, on their own they were only 67% predictive of total points. Statistically speaking, this is not strong at all, but it was a surprise to me that just 2 of the 23 players on a team could be more than 50% predictive.
Most Important pair of positions: Center and Max Winger
As mentioned in the previous paragraph, Center and Winger or Center and Goalie are not strong enough alone to conclude they are all that is necessary. The best result in the study was unfortunately a tad trivial, but nonetheless enlightening. Evidently every one of the positions mentioned is valuable, but the duo of wingers is redundant and a Max Winger is all that is necessary. This group of positions had an r-squared value of 88.5%. To put that in some context, the total GVT of a team is 91.17% predictive of its point total. That means that after accounting for those top few players, the rest of the team doesn't add much by way of predictive power.
Most Important Subset of Positions: Center, Max Winger, Top 2 Defenders, Goalie.
How are the Devils Doing?
A couple things to note here. The Devils representatives were chosen by highest GVT (I considered Time On Ice but got some really weird results). They are as follows with their NHL Rank for the listed position in parentheses:
C: Adam Henrique (21st in NHL)
LW: Patrik Elias (17th in NHL)
RW: Jaromir Jagr (9th in NHL)
MaxW: Jaromir Jagr (14th in NHL)
Defender1: Andy Greene (11th in NHL)
Defender2: Marek Zidlicky (5th in NHL)
G: Cory Schneider (17th in NHL)
The only ranking that doesn't make sense here is Schneider's, but this was mostly because of the games-played differential between him and the top starting goalies: of the 16 goalies ahead of him, 15 played more games (Cam Ward was the other one). The only surprising player on the list is Henrique, who outperformed Zajac from a GVT point of view.
Below is a sortable table of all NHL Teams and their max position GVTs.
In total, when accounting for the top pair of most important positions (Center and Max Winger), the Devils' Expected Point Total was 89.23, which is 18th in the NHL. However, when adjusted for the more accurate final predictors including the defensive pair and goalie, the Expected Point Total goes up to 93.8, which is 14th in the NHL. Furthermore, the residuals show that the Devils were the 5th most under-performing team. For an explanation of residuals, please see my audition post.
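As a quick illustration of how an expected point total and a residual come out of a regression, here is a small sketch. For brevity it uses a single predictor rather than the full set of positions, and the teams and numbers are invented, so it will not reproduce the figures above. The residual is simply actual points minus expected points, so the most negative residual marks the most under-performing team.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - slope * mean_x, slope


def residual_table(teams):
    """Return (team, expected_points, residual) sorted worst under-performer first."""
    names = list(teams)
    xs = [teams[name][0] for name in names]
    ys = [teams[name][1] for name in names]
    intercept, slope = fit_line(xs, ys)
    rows = []
    for name, x, y in zip(names, xs, ys):
        expected = intercept + slope * x
        rows.append((name, expected, y - expected))  # residual = actual - expected
    return sorted(rows, key=lambda row: row[2])


# Invented data: team -> (combined top-player GVT, actual points).
teams = {
    "Team A": (10.0, 85.0),
    "Team B": (20.0, 90.0),
    "Team C": (30.0, 110.0),
    "Team D": (40.0, 115.0),
}
for name, expected, residual in residual_table(teams):
    print(name, round(expected, 1), round(residual, 1))
```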
Summary and Conclusion
There was a lot of statisticky mumbo jumbo in this post and for that I apologize. The long and short of it is this. The common refrain that a team needs to be "strong up the middle" to succeed is, in my estimation, not true based on these findings alone, for a couple of reasons. Though the Center is the most important position, a Center still needs a Winger in order to succeed. Furthermore, having 1 strong defender is not enough, because the 2nd defender is, in some cases, even more predictive of the team's ultimate point total.
The Devils had an above-average set of top-end players with regard to GVT, and would have ranked even higher had Schneider logged a few more games, once again suggesting that on the advanced stats sheet, this is a playoff team.
Thoughts and Suggestions
Do you think a team needs to be strong up the middle? How do you feel about the Devils' top-end talent? Do you think securing the studs first is how you construct a team, or can a deep team without stars succeed?
With regard to this article, was it too dense? If you didn't understand it, where did I lose you? If you did understand it, what did you think of my conclusions? Were they well-reasoned or was there a gap in my logic? Please leave comments below.