Picking up on my series about a new shot-quantity/shot-quality hybrid measure I’ve been calling Expected Goals, I wanted to share some of the predictive testing I’ve done on it using only the 2011-12 season, the last full season of NHL play. I would have posted this earlier, but I was too busy finding awesome sweaters in Toronto to finish the numbers.
For all the numbers in this post I’m referring to non-empty net even strength stats, which includes all varieties of even strength play (4 on 4, etc). I’ll also be referring frequently to the concept of team points percentage — to calculate this I just used a team’s total standings points divided by the total available points they could have had at that point in the season.
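In code form, the points percentage calculation is trivial — a quick sketch (function name is mine, not anything official):

```python
def points_percentage(standings_points, games_played):
    """Points percentage: standings points earned divided by points available.
    Each NHL game played offers a maximum of 2 standings points."""
    return standings_points / (2 * games_played)

# A team with 25 points through 20 games has earned 25 of 40 available points
print(points_percentage(25, 20))
```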
First I wanted to explore the overall predictive value of a suite of measures over the entire season. Basically, if I had no idea how the teams finished the season in terms of points, how well could I predict those points using a full season's values of each of a list of independent variables (separately, not together)? I know team x had y Corsi, so I predict they'd have z points, and so on.
This is as good a time as any to define some of the independent variables. Corsi is a measure of all shot attempts. Fenwick is the same as Corsi, it just takes out blocked shots. EG% is the new metric I’m using, Expected Goals, which is related to Fenwick but assigns each shot taken a probability of being a goal based on its distance and shot type (effectively a weighting for each shot). Goals is the percentage of all even strength goals scored in a team’s games that the team itself scored. <25ft is a team’s share of all unblocked shots taken within 25 feet of a net: shots it takes within 25 feet of the opposition’s net versus shots taken within 25 feet of its own net. PTS%-STD stands for ‘Points Percentage – Season to Date’ — so, fairly self-explanatory. Most of the measures also have complementary “Close” metrics, which strip out any play that occurs in non-close games (a close game is within 1 goal in the first 2 periods or tied in the 3rd period).
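To make the Expected Goals idea concrete, here's a toy sketch of the weighting described above. The shooting probabilities in the table are invented placeholders for illustration, not the actual fitted values behind EG% — a real version would estimate P(goal) from historical shot data by distance and shot type:

```python
# Toy sketch of the Expected Goals weighting. These probabilities are
# made up for illustration; the real model fits them from shot data.
GOAL_PROB = {
    ("wrist", "0-25ft"): 0.12,
    ("wrist", "25ft+"): 0.04,
    ("slap", "0-25ft"): 0.10,
    ("slap", "25ft+"): 0.05,
}

def expected_goals(shots):
    """Sum P(goal) over a list of (shot_type, distance_band) attempts."""
    return sum(GOAL_PROB[shot] for shot in shots)

def eg_pct(team_shots, opp_shots):
    """EG%: a team's expected goals as a share of both teams' expected goals."""
    team_xg = expected_goals(team_shots)
    opp_xg = expected_goals(opp_shots)
    return team_xg / (team_xg + opp_xg)
```

The point of the weighting is that a team piling up long-range slap shots gets less EG% credit than one generating the same number of attempts from in close.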
Of course, the highest r-squared correlation to a team’s full season points percentage is goals. It’s the ultimate record of what happened over any number of games in a season, as goals are what decide who wins a game and who loses it. By the end of an 82-game season, it’s hard to argue against the logic of a team’s even strength goal differential being a pretty good indicator of that team’s strength.
We also see traditional shot metrics like Corsi and Fenwick further down the list in terms of r-squared correlations. The advantage they have in accumulating large sample sizes loses potency over 82 games, though this particular application was never really their strong suit anyway. Their close measures do get somewhat closer to goals. But most interesting to me are the results for the distance-weighted metrics here, EG% and <25ft. In fact, EG-Close does almost as good a job of reflecting how good a team is over an entire season as goals themselves (0.77 vs 0.79 r-squared).
In any case, this is just stage setting. Let’s have a look at how good these measures are at predicting the future for 2011-12. Let’s pretend it’s October 29, 2011.
I prefer picking points in time as opposed to an exact number of games played, because if we were to do this exercise in any current season, we would never have the luxury of stopping time without disregarding the most recent slate of games played by some teams. October 29th is about three weeks into the season, after each team has played 8-13 games.
What I’m trying to do here is take each measure for all teams up to this point in the season, and try to predict what their points percentage would be for the rest of their games based on them.
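The testing procedure boils down to a single-predictor regression for each metric. Here's a minimal sketch with numpy — the sample arrays are hypothetical placeholders, not the actual 2011-12 values:

```python
import numpy as np

def r_squared(x, y):
    """R-squared of a least-squares fit of y on x. For a single
    predictor this equals the squared Pearson correlation."""
    r = np.corrcoef(x, y)[0, 1]
    return r ** 2

# Hypothetical example: each entry is one team's season-to-date metric
# (e.g. Corsi%) paired with its points percentage over the REST of the season.
metric_to_date = np.array([0.54, 0.48, 0.51, 0.45, 0.57])
rest_of_season_pts = np.array([0.60, 0.47, 0.55, 0.44, 0.62])
print(round(r_squared(metric_to_date, rest_of_season_pts), 3))
```

Running this for every metric at the same cutoff date and comparing the resulting r-squared values is exactly the comparison in the tables that follow.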
The metrics that most mainstream media guys would use to diagnose team strength — goals and points — are about as accurate at picking how well those teams will do for the rest of the year as throwing darts at a dartboard.
CorsiClose is the best predictor at this early point in the season, followed closely by plain old Corsi. Corsi has the largest sample size of all of these metrics, which is why it shines early, and I’m guessing the unbiasing effect of looking only at close situations just slightly helps CorsiClose overcome its smaller sample size.
Clearly in third place is raw Expected Goals.
Let’s see what happens when we look at games up to November 15th, 2011, after each team has played 15-19 games:
CorsiClose is still clinging to the lead in predicting how a team will play from that point onwards, but is being quickly approached by a pack of upstarts. Raw Expected Goals is second, raw shots within 25 feet is third, and expected goals in close situations is fourth.
Bringing up the rear are again season to date points percentage and goals percentage.
Let’s again move forward in time to see how the metrics up to December 25th can predict a team’s performance for the rest of the year (after about 32-36 games played):
By the Christmas break, shots taken within 25 feet seem to have a very high level of utility for predicting the rest of the season, with Expected Goals % now in second spot.
Again, the worst are goals and season to date points percentage.
Let’s have a look at how each team was doing in each of these at this point in the season in detail:
I’m not going to delve too deep here, but I’ll point your attention to an interesting case: the Leafs. They earned 57% of possible points before Christmas but only 41% afterwards. They had scored 52% of the goals up to that point in the year and their shot metrics were all just shy of 50%, but their expected goals and <25ft marks were all lower, around 45%-47%. In this case, the distance-weighted metrics seemed to portend how the rest of their season would play out, while the traditional shot metrics did not.
You can go through the chart and say “a-ha! there’s one!” for cases where EG% or <25ft did better or worse than the traditional shot metrics, so I’d emphasize the r-squared numbers above, which reflect the aggregate effect.
I’d like to continue this exercise by testing against earlier seasons and at the player level as well.
But I’ll leave you with a thought. Traditional shot metrics like Corsi or Fenwick cannot be improved upon — their predictive power is what it is. But something like Expected Goals % is derived by the analyst. It requires rules, definitions, testing, formulas, and some creativity. I’ve done some fairly bare-bones analysis to date, incorporating simple shot distances and shot types. There are other ways that I believe this formula can be improved upon. I’m not sure how much of those improvements I’ll publish, but they can only help the utility of the metric from its current point, not hinder it. And considering that it already has some promising results here, I’d guess that there is indeed a lot of fertile ground left to explore to make it even better.