The goal with this series is to build a new game projection model. In this first article I will focus on the different variables/metrics. I want to test how well they predict future results.
The old model
But first a few comments on the old model. It was built on the basis of Evolving-Hockey’s GAR and xGAR models. I started off by creating a descriptive model (sGAA), and then I converted that into a predictive model. This is a decent methodology and it worked well in year one.
Then this offseason I made some bad model changes. I have likely overfitted the data, and the result is a model that overestimates powerplays and penalties, and underestimates defenders.
This is how the model has performed this season:
The new model
So, I could just redo the old model, and I would probably end up with a decent model. However, I would rather rethink the entire methodology. Instead of converting a descriptive model into a predictive one, I want to build a predictive model right away. To do this I need to test how predictive the different available variables are – hence this article.
I also want a model that’s easier to update and hopefully easier to interpret.
Preparing the data
I want the model to be player based, so that the game projections depend on the lineups. This complicates things a fair bit, but it should give a better model.
So, before we can start the tests, we need to prepare the data set. The model will be built on data from 2014/2015 to 2019/2020. I don’t include the 2020/2021 season because it was shortened, there were no interdivision games and there were no fans in the buildings.
Basically, I want player data on a game basis. I did not include empty net situations, but the data is split into EV, PP and SH columns.
The variables I want to test are:
Even strength on-ice: xGF, xGA, GF, GA, CF, CA
Special teams on-ice: PP G+/-, SH G+/-
Individual stats: G, A1 (primary assists), A2 (secondary assists), GAx (goals scored above expected), penalties taken, penalties drawn
Goaltender stats: GSAx, GSAx (shot based)
You can find all of these variables in the data file. However, I’m not really interested in simple counting stats. I want to see whether a player is better or worse than average (I calculated the averages per second next to the table). Obviously, a forward is expected to score more goals than a defender, and you’re expected to score more on the powerplay than at even strength. These things are accounted for by looking at stats above average.
The calculation is simply:
Impact = (Variable/second – Average/second) * TOI (seconds)
For example, for on-ice xGA at even strength:
xGA impact = (xGA/s – average xGA/s) * EV_TOI
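The calculation above can be sketched in a few lines of Python. This is only an illustration: the function name and the numbers are hypothetical, and the league-average rate is assumed to be already split by position and game situation.

```python
def impact_above_average(player_total, league_rate_per_second, toi_seconds):
    """Return how far a player's total is above (+) or below (-) what an
    average player would have produced in the same ice time."""
    if toi_seconds <= 0:
        return 0.0  # no ice time means no impact
    player_rate = player_total / toi_seconds
    return (player_rate - league_rate_per_second) * toi_seconds


# Hypothetical numbers: 0.9 on-ice xGA in 900 seconds of EV ice time,
# against an assumed average of 0.0007 xGA per second for the position.
xga_impact = impact_above_average(0.9, 0.0007, 900)
```

Note that the expression is algebraically the same as the player's total minus the average rate multiplied by his ice time.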
I want to differentiate between forwards and defenders in my testing, so in total this gives us 14 forward variables, 14 defender variables and 2 goaltender variables for testing.
The test set up – Creating baselines
The goal is to see how each variable impacts game predictions. To evaluate the impact I’m using log loss, but before we get that far I need to calculate the win probability for each team.
I want to isolate the impact of each variable, so in the beginning I assume all teams are equally strong (the team strength is set at 0.500). This gives us the following win probabilities:
Win prob. = Strength(team)/(Strength(team)+Strength(opponent))
Since the strength of all teams is set at 0.500, the result will always be:
Win prob. = 0.5/(0.5+0.5) = 50%
If the probability is always 50%, then the error of the game predictions will always be 50%, and the log loss will be:
Log loss = -ln(1-error) = -ln(1-0.5) = 0.6931
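As a quick check of the first baseline, here is a minimal Python sketch of the two formulas above. The function names are my own, and the log loss is written in terms of the probability assigned to the actual winner, which is the same as 1 minus the error.

```python
import math


def win_probability(strength_team, strength_opponent):
    """Win probability from the two team strengths."""
    return strength_team / (strength_team + strength_opponent)


def log_loss(prob_of_actual_winner):
    """Log loss for one game: -ln(probability given to the winner)."""
    return -math.log(prob_of_actual_winner)


p = win_probability(0.500, 0.500)  # every team is equally strong
baseline = log_loss(p)             # the same value for every game
```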
This will be our first baseline. The next step is to adjust for home ice advantage and back-to-back effects. When we do this the win probabilities will be:
|Situation|Home win prob.|Away win prob.|
|Both teams rested|54.33%|45.67%|
|Home team B2B|49.78%|50.22%|
|Away team B2B|58.81%|41.19%|
|Both teams B2B|54.33%|45.67%|
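Averaging the log loss over a set of games with these probabilities could look like the sketch below. The home win probabilities are taken from the table, while the dictionary keys, the function name and the four sample games are made up for illustration.

```python
import math

# Home win probability per rest situation (values from the table above).
HOME_WIN_PROB = {
    "both_rested": 0.5433,
    "home_b2b": 0.4978,
    "away_b2b": 0.5881,
    "both_b2b": 0.5433,
}


def average_log_loss(games):
    """games: iterable of (situation, home_won) pairs."""
    total = 0.0
    count = 0
    for situation, home_won in games:
        p_home = HOME_WIN_PROB[situation]
        # Probability the baseline gave to the team that actually won.
        p_winner = p_home if home_won else 1.0 - p_home
        total += -math.log(p_winner)
        count += 1
    return total / count


# Hypothetical sample of four games, not real results.
sample_games = [
    ("both_rested", True),
    ("home_b2b", False),
    ("away_b2b", True),
    ("both_rested", False),
]
sample_loss = average_log_loss(sample_games)
```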
When we calculate the average log loss based on these win probabilities it comes out to 0.6881. This will be our second baseline. If we visualize the log loss as a function of game number, this is the result:
The final baseline I want to create is based on closing betting lines. Now the win probability is calculated based on closing odds (Source: https://www.sportsbookreviewsonline.com/). When we calculate the log loss based on closing lines it comes out to 0.6714 and added to the visualization it looks like this:
In the end we want a model that can compete with the market, so it’s a good baseline to set up. The analysis below will only include the trendlines to ensure that the graphs are easy to interpret.
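I haven’t spelled out above how closing odds are turned into win probabilities. One common approach, assumed here purely for illustration, is to convert American moneylines into implied probabilities and then normalize the two sides so the bookmaker’s margin is removed. The function names and the example odds are hypothetical.

```python
def implied_probability(moneyline):
    """Implied probability of an American moneyline, margin included."""
    if moneyline < 0:
        return -moneyline / (-moneyline + 100.0)
    return 100.0 / (moneyline + 100.0)


def no_vig_probabilities(home_moneyline, away_moneyline):
    """Scale both implied probabilities so that they sum to 1."""
    home_raw = implied_probability(home_moneyline)
    away_raw = implied_probability(away_moneyline)
    total = home_raw + away_raw  # above 1 when the book takes a margin
    return home_raw / total, away_raw / total


# Hypothetical closing line: home favourite at -150, away team at +130.
home_p, away_p = no_vig_probabilities(-150, 130)
```

Proportional normalization is only one of several ways to remove the margin, so the resulting log loss may differ slightly depending on the method.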
The test setup – Adding variables
In this test I want to add each variable individually, so I can see how it impacts the log loss. I will still include the home-ice and back-to-back adjustments.
I’m going to let every player start each season at the same level (value of 0), and their value then changes based on their performance above average in the previous 50 in-season games. The team strength is then calculated this way:
Team Strength = 0.500 + SUM(Variable changes)*x
So, the team strength is calculated based on how the lineup has performed (in the tested variable) previously in the season. The team strength will therefore always start at 0.500 and then change based on performance in the tested metric/variable.
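A minimal sketch of this bookkeeping is shown below. It assumes that a player’s value is the sum of his per-game impacts over the last 50 games of the current season; the class and function names are hypothetical.

```python
from collections import deque

WINDOW = 50  # number of previous in-season games that count


class PlayerForm:
    """Tracks one player's impact above average within a season."""

    def __init__(self):
        # A deque with maxlen drops the oldest game automatically.
        self.recent = deque(maxlen=WINDOW)

    def value(self):
        return sum(self.recent)  # 0 at the start of every season

    def add_game(self, impact):
        self.recent.append(impact)


def team_strength(lineup_values, x):
    """Team Strength = 0.500 + SUM(variable changes) * x."""
    return 0.500 + sum(lineup_values) * x
```

A new `PlayerForm` would be created for every player at the start of each season, and `add_game` would be called only after the game has been predicted, so that the projection never sees the result it is trying to predict.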
The x-value is defined as the value where the overall log loss is at its lowest. This way we can test how each variable impacts the log loss.
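Finding the x-value can be sketched as a simple search over candidate values. The example below is a simplification under stated assumptions: it leaves out the home-ice and back-to-back adjustments, uses a plain grid of candidates, and runs on four made-up games.

```python
import math


def average_log_loss(games, x):
    """games: list of (home_sum, away_sum, home_won), where the sums are
    the lineup totals of the tested variable before the game."""
    total = 0.0
    for home_sum, away_sum, home_won in games:
        home_strength = 0.500 + home_sum * x
        away_strength = 0.500 + away_sum * x
        p_home = home_strength / (home_strength + away_strength)
        # Keep the probability strictly inside (0, 1) so the log is defined.
        p_home = min(max(p_home, 1e-6), 1 - 1e-6)
        p_winner = p_home if home_won else 1 - p_home
        total -= math.log(p_winner)
    return total / len(games)


def best_x(games, candidates):
    """Return the candidate x with the lowest average log loss."""
    return min(candidates, key=lambda x: average_log_loss(games, x))


# Hypothetical games, not real data.
games = [(2.0, -1.0, True), (-1.5, 0.5, False), (0.5, 1.0, False), (1.0, -2.0, True)]
candidates = [i / 1000 for i in range(-50, 51)]
x_opt = best_x(games, candidates)
```

With real data a numerical optimizer could replace the grid, but a grid makes it easy to see how the log loss changes with x.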
Even strength expected goals
Let’s start by testing the expected goals variables. For defender xGA the lowest log loss (0.6834) was found when x was -0.00654. The x-value is of course negative because having a low on-ice xGA is a good thing.
I did the same thing for forward xGA, defender xGF and forward xGF. You can see the log loss and x-value in the table below.
Finally, I combined all the xG-variables by keeping the x-values constant and then finding the y-value with the lowest log loss:
Team strength = 0.500 + ([D xGA]*-0.00654+[F xGA]*-0.00439+[D xGF]*0.00736+[F xGF]*0.00485)*y
The y-value is below 1, because the variables are dependent. [D xGA] is closely connected to [F xGA] and [D xGF] is closely connected to [F xGF].
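The combined formula can be written out as the sketch below. The four x-values are the ones from the formula above, while the function name, the lineup sums and the y-value of 0.6 are hypothetical placeholders.

```python
# x-values found for each variable individually (from the formula above).
X_VALUES = {
    "D_xGA": -0.00654,
    "F_xGA": -0.00439,
    "D_xGF": 0.00736,
    "F_xGF": 0.00485,
}


def combined_team_strength(lineup_sums, y):
    """lineup_sums maps each variable name to the lineup's summed impact."""
    weighted = sum(X_VALUES[name] * lineup_sums[name] for name in X_VALUES)
    return 0.500 + weighted * y


# Hypothetical lineup sums and a placeholder y below 1.
example_sums = {"D_xGA": -1.0, "F_xGA": -2.0, "D_xGF": 1.5, "F_xGF": 3.0}
example_strength = combined_team_strength(example_sums, y=0.6)
```

The y-value itself can be found with the same kind of search as the x-values, with the x-values held constant.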
In the graph below you see how the log loss changes as the seasons progress. I’m just showing the trendlines to make the graph easier to read.
There are two things worth noting. xGF appears to be a better predictor of future results than xGA, and [F xG] predicts results better than [D xG], perhaps simply because there are more forwards than defenders on the ice at even strength.
Even strength goals
Next, we will look at on-ice even strength goals. The procedure is the same as for xG, and you can see the results here:
And here’s the visualization of the trendlines:
Again, we see that GF is a better predictor than GA. More surprisingly, on-ice goals appear to predict results better than on-ice expected goals. This is different from what we see when we simply look at team stats.
Even strength Corsi
Here are the results when we add the Corsi variables:
And here’s the visualization:
Corsi in this test is a worse predictor than both expected goals and actual goals.
The metrics I use for special teams are on-ice PP G+/- above average and on-ice SH G+/- above average. I could have used xG+/- instead, but for special teams I think it’s better to look at the actual results.
Here are the results of the test:
And here’s the visualization:
Obviously, even strength performance is more important than special teams performance, since most of the game is played at even strength. But accounting for special teams will still improve your model. PP performance appears to be a slightly better predictor than PK performance.
When I’m looking at points, I’m using points above average, so I account for position (F or D) and role (EV_TOI, PP_TOI and SH_TOI). Here are the results:
And here’s the visualization:
Individual points seem to be a really strong predictor of future results, especially for forwards. This is probably something that’s underrated by a lot of analytics people, and likely one of the reasons why Dom Luszczyszyn’s model is so successful.
Goals scored above expected (GAx)
I also tested goals scored above expected (shooting): GAx = iG-ixG
Here’s the visualization:
Defender GAx doesn’t appear to predict future results at all; I think there’s too much variance on long-range point shots. Forward GAx is a decent predictor.
Goals saved above expected (GSAx)
I tested two goaltender metrics: normal Fenwick-based GSAx and shot-based GSAx. I don’t think goaltenders have much impact on shot misses, so I only want to credit goaltenders with the actual saves. I have written a separate article on this subject.
Here’s the visualization:
In terms of predictiveness there’s no real difference between the two models.
Finally, I tested penalties. Here’s the result:
And the visualization:
Penalty metrics have little to no predictive value. Adding penalty metrics for forwards might improve your model a tiny bit.
Comparing the variables
That was a lot of tables and graphs, but here are all the variables sorted from best predictor (lowest log loss) to worst predictor (highest log loss):
The best predictors are individual forward points and offensive on-ice stats (GF and xGF). This is definitely something to keep in mind when we get around to actually building the model.
The next step will be to combine the best variables in a smart way. Most of them are dependent on each other, so we can’t just include all of them. But that will be the subject for the next article in the series.