Emmanuel Perry, hockey stat-boy wunderkind, profane tweeter, lover of elk, and creator of Corsica.hockey, is hosting a prediction contest between different analysts and their models. It features big names in stats such as Dawson Sprigings (DTMAboutHeart, creator of the most oft-cited WAR model at the moment), Dom Luszczyszyn (creator of Game Score), and TSN’s Scott Cullen. It’s an excellent program and it offers an objective commentary on where teams are projected to land. I’m going to run through a few of the models and talk about where the Devils are in them, and what the implications of that are. I should preface this by saying that I do have a Master’s in applied statistics, but have almost no experience in machine learning, and only very limited exposure to data mining. Those techniques are heavily employed in these models, so my descriptions will not be thorough with respect to those disciplines. I’ll do my best to maintain the integrity of the models and not misrepresent them through my simplifications.
The three models I’ll be looking at are Manny’s “Salad” model, Dr. Micah Blake McCurdy’s “Edgar” model, and Dom Lyzscyenszn’s “Preszczyszyn”.
“Salad” by Emmanuel Perry
The Model: Manny explains the model here. Though he doesn’t appear to say this explicitly anywhere, my guess is that the model is called Salad because it is built from an ensemble of 11 “mini-models”. The mini-models were chosen for predictive power and noncollinearity, which is to say each one is good on its own, and no two are alike.
This model uses the brute-force strength of machine learning, feeding in 5,800 total variables (everything from Goals to Corsi/Fenwick, to WAR and K, to Star Ratings) and choosing how to most effectively use each of the variables within the mini-models, and each of the mini-models within Salad. The ensemble is tested on previous years, and the choices are made to minimize the log-loss of the model (log-loss is a statistic that penalizes overconfidence: you lose more points for being wrong about a 99% prediction than a 51% prediction). Manny’s model has no glaring gaps in what it accounts for, but it will sometimes require some work to interpret given the nature of the algorithm.
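To make the log-loss idea concrete, here is a minimal sketch of how a single game prediction gets scored. This is the standard textbook definition of log-loss, not Manny’s actual code, and the function name and the two example probabilities are mine.

```python
import math

def log_loss(prob_win: float, won: bool) -> float:
    """Penalty for one prediction; lower is better.

    prob_win is the predicted chance that the team wins,
    won is what actually happened.
    """
    # Probability the model assigned to the outcome that occurred.
    p = prob_win if won else 1.0 - prob_win
    return -math.log(p)

# Score a confident pick and a cautious pick, right and wrong.
for p in (0.99, 0.51):
    right = log_loss(p, True)
    wrong = log_loss(p, False)
    print(f"predicted {p:.0%}: penalty if right {right:.3f}, if wrong {wrong:.3f}")
```

Being wrong at 99% costs roughly 4.6, while being wrong at 51% costs roughly 0.7, which is why a model tuned on log-loss learns not to be overconfident.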
The Devils: Manny originally had the Devils projected for 73 points, or 29th in the NHL. Since then his model has moved the Devils all the way up to 90 points and 18th in the NHL, but still with only a 22.5% chance of making the playoffs. Manny’s model continues to learn. It is currently leading the prediction contest, and its ratings of players and teams update automatically. The Devils have leapfrogged the Sabres, Rangers, and Red Wings and are projected to finish 5 points behind the Flyers for the last spot in the East.
“Preszczyszyn” by Dom Lusczyczycyzyyzy
The Model: Dom Luscysyeycysyn explains his model (pronounced like “precision”) here (WARNING: It’s from the Athletic, which is subscription-based, so this link is behind a paywall). He utilizes a statistic he devised called Game Score. For skaters, the stat takes into account points, shots, blocks, penalties, faceoffs, Corsi, and plus/minus. For goalies, it takes into account saves and goals against. The last 3 years of data are considered for the model.
The prediction model takes this stat and, through multivariate regression, projects performance based on previous data, sample size, age, and usage. This process produces team values, which fuel simulations of the 2017-18 schedule that are run 50,000 times. The simulation results are used to make the predictions. Dom Luusyguusy explains in his write-up that he didn’t account for injuries, used fantasy guides and DailyFaceoff to project lineups, and had difficulty accounting for rookies. He accounts for rookies using fantasy projections and team shot rates.
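The “simulate the schedule thousands of times” step can be sketched in a few lines. This is only an illustration of the general Monte Carlo idea, not Dom’s implementation: the per-game win probabilities, the 90-point cutoff, and the function name are all made up, and I ignore the loser point for overtime losses to keep it short.

```python
import random

def simulate_points(win_probs, n_sims=50_000, seed=0):
    """Play out a schedule n_sims times and return the point total of each run.

    win_probs holds one (hypothetical) win probability per scheduled game.
    Simplification: 2 points for a win, 0 otherwise (no overtime loser point).
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_sims):
        wins = sum(1 for p in win_probs if rng.random() < p)
        totals.append(2 * wins)
    return totals

# Hypothetical team that is a slight underdog in all 82 games.
schedule = [0.48] * 82
totals = simulate_points(schedule, n_sims=5_000)

mean_points = sum(totals) / len(totals)
# Share of simulated seasons reaching an arbitrary 90-point cutoff.
share_90_plus = sum(t >= 90 for t in totals) / len(totals)
print(f"mean points: {mean_points:.1f}, seasons with 90+: {share_90_plus:.1%}")
```

A real playoff probability needs every team simulated together so the standings can be compared in each run; the fixed cutoff here is just a stand-in for that.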
The Devils: Dom Luushamalamadingdong originally had the Devils projected for 78.8 points and last in the league (4% postseason berth chance). Since then, the Devils have risen to an 86.3 point projection, which is 24th in the league (20% postseason berth chance). About a week ago, Dom Lushiousness added NHL data and projections for rookies. This, plus the Devils’ performance, has fueled their rise. The Devils have jumped over the Sabres, Rangers, Red Wings, and Avalanche and are 6.7 points behind the Islanders for 8th in the East.
“Edgar” by Dr. Micah McCurdy
The Model: Micah explains his model here. He estimates unblocked shot and penalty rates using the past 2 seasons of data. Shots are adjusted for situation and score effects. This model is unique in that it actually simulates games second by second, considering the probability that a shot for or against will occur (and, more importantly, where it will occur) and the probability that each of those shots will be a goal. At the end of the game the goals are tallied and a winner is decided.
Micah decided to favor interpretability over predictive power this time around. The model is not “trained” the way Manny’s is, but it is easy to understand. He has admitted weaknesses in not including coaching, competition/teammate quality, and age effects. Rookie projections for skaters likely to log significant time were informed by Hannah Stuart.
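For a feel of what simulating a game second by second looks like, here is a toy version. It is my own sketch of the general idea and not Edgar itself: the shot rates, the two shot zones, and the goal probabilities are invented numbers, there are no penalties or score effects, and it stops at the regulation score rather than settling ties.

```python
import random

GAME_SECONDS = 60 * 60  # regulation time only

# Hypothetical chance that an unblocked shot from each zone becomes a goal,
# and how often shots come from each zone.
ZONE_GOAL_PROB = {"slot": 0.12, "perimeter": 0.03}
ZONE_WEIGHTS = {"slot": 0.4, "perimeter": 0.6}

def simulate_game(home_rate, away_rate, goal_prob, zone_weights, rng):
    """Step through every second; each side may take a shot, and each shot may score.

    home_rate and away_rate are per-second shot probabilities (assumed values).
    Returns (home_goals, away_goals) at the end of regulation.
    """
    zones = list(goal_prob)
    weights = [zone_weights[z] for z in zones]
    goals = {"home": 0, "away": 0}
    for _ in range(GAME_SECONDS):
        for side, rate in (("home", home_rate), ("away", away_rate)):
            if rng.random() < rate:
                # Where the shot comes from decides how dangerous it is.
                zone = rng.choices(zones, weights=weights)[0]
                if rng.random() < goal_prob[zone]:
                    goals[side] += 1
    return goals["home"], goals["away"]

rng = random.Random(42)
# Roughly 45 and 42 unblocked shots per 60 minutes, expressed per second.
home_goals, away_goals = simulate_game(45 / 3600, 42 / 3600,
                                       ZONE_GOAL_PROB, ZONE_WEIGHTS, rng)
print(f"final (regulation): home {home_goals}, away {away_goals}")
```

The appeal of this design is that every input is something you can look at and argue about (shot rates, shot locations, finishing), which fits the emphasis on interpretability.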
The Devils: The Devils are projected for 86.4 points, or 25th in the NHL, up from 82.3 and 29th. Their playoff chances have also gone up from 15% to 30%. When I last spoke with Micah, there was no concrete plan for how to account for relevant rookies. This may have changed since. The Devils have jumped over Buffalo and Boston and tied Montreal.
Manny’s machine-learning heavy model has us at 90 points with a 22.5% chance of finishing in playoff position. Dom Luspielstand’s game-score-fueled model has the Devils projected for 86.3 points and a 20% chance at the playoffs. Micah McCurdy’s scientific shot-, penalty-, and schedule-motivated model has us at 86.4 points and a 30% postseason probability. On average, these three models project us at 87.6 points and a 24% chance of making it to the playoffs for the first time in 6 years. This would be our highest point total since 2014 and, if not for being in an incredible division, would likely be even higher.
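The averages above are just the simple mean of the three projections quoted in this post, which you can check yourself:

```python
# (projected points, playoff chance in %) as quoted for each model in this post.
projections = {
    "Salad": (90.0, 22.5),
    "Preszczyszyn": (86.3, 20.0),
    "Edgar": (86.4, 30.0),
}

avg_points = sum(pts for pts, _ in projections.values()) / len(projections)
avg_playoff = sum(pct for _, pct in projections.values()) / len(projections)

print(f"{avg_points:.1f} points, {avg_playoff:.0f}% playoff chance")
# prints "87.6 points, 24% playoff chance"
```

This weights all three models equally, which is a choice of convenience rather than a claim that they are equally accurate.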
Each of these models has its own issue in assessing the Devils. Between uniquely impactful rookies and a possible return to form for Schneider, there may be significant misses that take a while to adjust for. With regard to young talent: Dom uses fantasy projections and some data, Manny just lets his model learn, and Micah uses scouting and will adjust moving forward.
Regardless, the future looks good for this team, and this season could be the first. Projections have us on the playoff bubble. Despite some early luck, it’s nice to know the floor isn’t that low for the Metropolitan Division-leading New Jersey Devils.
What did you guys think of these models? What do you think of their methodologies? Do you think they are underestimating or overestimating this team for any particular reason? Do you think other teams are getting missed in these projections? How do you feel about the Devils’ playoff chances? Leave your thoughts below, and as always, thank you for reading.