So it’s been a while since my last blog. In that blog, I tried to predict future LS-GAA using weighted data from the previous three seasons. In this blog I will use the same approach, but for more conventional on-ice metrics like corsi and expected goals. The purpose is not to discredit those metrics, but just to compare the predictability of my model to theirs.
For this exercise I’m using unadjusted 5v5 data. The predictability would likely be a bit higher if I used the adjusted numbers, but the goal here isn’t to refine the predictability of on-ice metrics. I just want a comparison for LS-GAA.
As in the last blog, I’m using weighted data from 2016/2017, 2017/2018 and 2018/2019 to create projected values, which I can then compare with the actual data from 2019/2020. The 2019/2020 season is of course prorated to 82 games. You can read blog 10 for a better description of the method.
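For readers who want to see the idea in code, here is a minimal sketch of the two steps: prorating a shortened season to 82 games and taking a weighted average of the three prior seasons. The weights (0.2, 0.3, 0.5) and the function names are purely illustrative assumptions, not the ones actually used; blog 10 describes the real method.

```python
FULL_SEASON_GAMES = 82


def prorate(value, games_played, full_season=FULL_SEASON_GAMES):
    """Scale a season total to a full-season pace."""
    return value * full_season / games_played


def project(past_seasons, weights=(0.2, 0.3, 0.5)):
    """Weighted average of past season values, oldest season first.

    The default weights are hypothetical placeholders; they simply
    give the most recent season the most influence.
    """
    if len(past_seasons) != len(weights):
        raise ValueError("need one weight per season")
    weighted_sum = sum(w * v for w, v in zip(weights, past_seasons))
    return weighted_sum / sum(weights)


# Made-up example: a player's on-ice goal differential in
# 2016/2017, 2017/2018 and 2018/2019, then 70 games in 2019/2020.
projected = project([10, 0, -10])
actual = prorate(7, games_played=70)
```

The projected value is then paired with the prorated actual value for every player, and those pairs are what the graphs in this blog compare.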
Let’s start by looking at on-ice goals. Here’s how the projected on-ice goal differential correlates with the actual on-ice goal differential:
It’s not easy to predict goal differential, and using goal differential from the past isn’t perfect. Even with three seasons of data, the predictability is fairly low.
And here’s the predictability of corsi differential:
Now we see much better predictability (R-squared = 0.311). It’s also better than it was for LS-GAA (see blog 10). This really isn’t surprising. The reason for using corsi instead of goals is that the sample size is larger and the repeatability is higher. The problem with corsi is that it doesn’t correlate particularly well with goal scoring (see blog 1).
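As a side note on how to read the R-squared values: for a simple straight-line fit of actual against projected values, R-squared equals the squared Pearson correlation between the two. The sketch below computes it that way. It assumes the graphs use an ordinary linear trendline, which the text implies but does not state, and the function name is my own.

```python
def r_squared(projected, actual):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(projected)
    if n != len(actual) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_p = sum(projected) / n
    mean_a = sum(actual) / n
    # Sums of squares and cross-products around the means
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(projected, actual))
    var_p = sum((p - mean_p) ** 2 for p in projected)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    if var_p == 0 or var_a == 0:
        raise ValueError("R-squared is undefined when one list is constant")
    return cov * cov / (var_p * var_a)


# Made-up numbers, only to show the call:
example = r_squared([1, 2, 3], [1, 3, 2])
```

An R-squared of 0.311 therefore means that the projected values explain roughly 31% of the variance in the actual values.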
Finally, here comes the predictability of expected goal differential:
It’s less predictable than corsi, but more predictable than goals. These findings are very much in line with what I expected. Using this method LS-GAA (Skaters, all strengths) is as predictable as 5v5 expected goals. I think that’s very good – especially considering how well team LS-GAA correlates with team goal differential.
The problem with my model is goaltending. I can’t really predict LS-GAA_GK, which is problematic. I have some ideas for how to increase the predictability of the goaltending, but that will have to wait for another blog.
Lastly, I have also looked at how predictive corsi differential and expected goal differential are of actual goal differential. So here’s the actual on-ice goal differential as a function of projected corsi differential:
And here comes the graph for expected goals:
I’m a bit surprised that corsi predicts goals better than expected goals do, but honestly neither works particularly well. In fact, past goals are a better predictor of future goals (R-squared = 0.122). This goes against what I expected. I would have thought expected goals to be the best predictor of future goals, and maybe if I had used team data instead of player data, I would have come to that conclusion. That’s beside the point though. I just wanted to compare the predictability of LS-GAA with the predictability of some on-ice metrics, and here the results are encouraging.
- LS-GAA_Skaters is about as predictive as 5v5 expected goals.
- I don’t think on-ice metrics are the best way to predict future player performance. You could probably make a decent model by combining and weighting a number of different on-ice metrics (including goals). However, using one preferred on-ice stat out of context is not a great way to evaluate a player.
Stay safe and remember to be kind
All raw data from www.evolving-hockey.com