I think we can all agree that advanced hockey analytics are reaching a place where they’re gaining wider acceptance. When things like Corsi are being talked about on Hockey Night in Canada, I think we’re making strides. But I contend that the largest challenge facing this field isn’t its marketability; it’s about the kinds of insights we can provide. Beyond some basic axioms, what can we actually provide feedback on in terms of how the game is played? I may know that a player drives shot differential, but why does he do that?
In this regard, there is an absolute ton of knowledge left to explore and uncover about the game. I think Tyler Dellow’s on the right track with his Big Data series, breaking down hockey at the per shift level to find out what kind of patterns become visible. But in the end, there’s only so much we can analyze just looking at shots and other meagre offerings from the NHL’s RTSS feeds.
I think part of what powered the baseball sabermetric revolution was the initial realization that baseball’s traditional stats were not sacred cows. They were paid attention to by everyone simply because that’s what some guy decided was worthwhile to count in the mid-19th century. “Hey, wins are important, they’re right there in a pitcher’s statline!” A holistic reevaluation of those metrics has led baseball to very fundamental questions and answers, helping people understand the very nature of what happens in the sport.
With respect to hockey, there is so much unbroken snow to trudge through, partly because we haven’t really stopped to ask ourselves what stats we think are the most important in a perfect world. We analyse what we analyse simply because that’s the stuff we have access to. And I would agree with the viewpoint that of all the major team sports, hockey is the most dynamic: it has the most changes of possession, the sheer speed and acceleration of skating allow arguably the most chaotic movement around the playing surface, the physical nature of the sport allows old red herrings like toughness and heart to become even more ingrained, etc.
Understanding what’s actually happening in a game, statistically, is freaking HARD. The imprecision of the ‘gut feeling’ is plastered all over front offices and coaching staffs leaguewide, because that’s really all they have to work with. Hockey is like the chess world before Deep Blue beat Kasparov — the heuristics of the human mind can make decisions faster and better right now, because the analytics haven’t caught up yet. But just like in chess, the day of reckoning is coming, I assure you.
We talk a lot about possession in hockey. Almost always, we’re talking about Corsi or shot differential, but enough research has been done to show that possession closely tracks shot differential, so we’ve basically made the terms synonymous. But at a basic level, possession is important in hockey. The more you have the puck, the more likely it is you’ll be able to get the puck into the offensive zone, the more likely it is you’ll create a shot, the more likely it is you’ll score (or not be scored on), the more likely it is you’ll win the game, and the more likely it is you’ll make the playoffs, and the more likely it is you’ll win the Stanley Cup.
What I’m interested in here today is how possessions are both created and destroyed. Have you ever watched a game and noticed that the play dies on a certain player’s stick? Of course, everyone has. Have you ever noticed that a player has a knack for stripping pucks and taking back possession? Again, of course. But the RTSS feeds don’t really help us out here very much. They have Giveaways and Takeaways, but that’s a pretty narrow definition of possession creation or destruction, and there are plenty more ways this can be accomplished. (Not to mention, those two stats seem pretty fluffy and badly counted anyway, but that’s beside the point.)
I want to know who creates a possession, how, and where. I also want to know who kills a possession, how, and where. So I counted it myself.
Let’s travel back in time to a very memorable Oilers game... January 22nd, 2013! Oilers 3 – 6 San Jose Sharks. Oilers fans will remember this game, where San Jose scored 6 goals in the first period and coasted to victory. An entire game of score effects meant the Oilers battled back to barely lose the Corsi count 39-36. Yak and Schultz scored their first NHL goals.
I started by outlining what things I should be tracking that would help quantify how possessions begin and end. Of course, as the game was played this list of potential events increased as I understood what was going on a bit more.
Possession starts:
- Hitting someone, resulting in a change in possession (CIP) within 5 seconds
- Winning a battle with someone, resulting in a CIP
- Takeaways – any act of taking the puck away from the other team while the puck is near their body
- Corralling a rebound from any shot against, whether it was on net or missed the net
- Blocking either a shot or a pass mid-flight, resulting in a CIP
- Winning a face-off
- Free Plays – basically this is the opposition just handing it to you, unattributed to any Oiler player
Possession ends:
- Face-off losses
- Getting hit, resulting in a CIP
- Icing the Puck
- Losing a battle when your team originally had possession
- Dump-Ins that are not retained by your team
- Giveaways – any act of losing the puck while not sending a legit pass
- Bad Passes – passes that do not work, resulting in CIP
- Bad Pass Receptions – bobbling a pass that was put in the right place, resulting in CIP
- Putting the puck over the boards, or any act of stopping play that stops a possession
- Missed or blocked shots that result in a CIP
I call these the Determinants of Possession. I’m not claiming this list is the most efficiently outlined or even totally comprehensive, but it’s what I noticed during game play. You’ll notice that a shot on net resulting in a play stoppage or CIP is not on here as a possession ender, as that’s what I consider one of the goals of hockey gameplay: getting shots on net. Ending a possession connotes, to me, a negative thing: you actively did something which ended a possession before your team had a chance to score.
An important thing to wrap your mind around: I would not expect possession starts and ends to be that far away from each other, even in lopsided ass kickings. Think about it: every possession that starts also has to end, and the number of possession changes has nothing to do with how long those possessions last. I freely admit this has nothing to do with all the good little plays a team makes in the middle of a possession, or all the bad little plays a team makes to continue not having possession. This is only meant to assign responsibility for the discrete event of forcing a possession change.
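For anyone who wants to try tracking a game this way, here is one possible shape for a single logged event, sketched in Python. Everything in it is hypothetical: the class, the field names, and the short labels for each determinant are made up for illustration, and they are not necessarily how the tracking for this game was recorded.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Zone(Enum):
    DEFENSIVE = "DZ"
    NEUTRAL = "NZ"
    OFFENSIVE = "OZ"


# Hypothetical short labels for the determinants in the list above.
START_TYPES = {
    "hit", "battle_won", "takeaway", "rebound", "block",
    "faceoff_win", "free_play",
}
END_TYPES = {
    "faceoff_loss", "got_hit", "icing", "battle_lost", "dump_in",
    "giveaway", "bad_pass", "bad_reception", "stoppage",
    "missed_or_blocked_shot",
}


@dataclass(frozen=True)
class PossessionEvent:
    period: int
    player: Optional[str]  # None for unattributed events such as free plays
    kind: str              # one of the labels in START_TYPES or END_TYPES
    zone: Zone

    def __post_init__(self):
        # Reject anything that is not a known determinant, e.g. a shot on net.
        if self.kind not in START_TYPES | END_TYPES:
            raise ValueError(f"unknown determinant: {self.kind}")

    @property
    def is_start(self) -> bool:
        """True if this event creates a possession, False if it ends one."""
        return self.kind in START_TYPES
```

Keeping the player, the determinant, and the zone on every row is what makes the who, how, and where questions answerable later with simple counting.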
You could have the same number of starts as ends, but your average possession might have been 5 seconds, while your opposition’s might have been 50 seconds, in which case you likely got beat very badly. This isn’t about finding out which team has more possessions. This is about finding out which players tend to create possessions, or tend to lose possessions, where they tend to do it, and how. This is about creating data that a team could use to diagnose specific issues with their players or systems.
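To put a number on that 5 seconds versus 50 seconds example, here is a tiny Python sketch. The function name is hypothetical and the possession count of 40 is invented purely for the arithmetic; it is not a figure from this game.

```python
def time_share(n_possessions: int, avg_for: float, avg_against: float) -> float:
    """Share of total possession time held by your team, assuming both
    teams had the same number of possessions."""
    total_for = n_possessions * avg_for
    total_against = n_possessions * avg_against
    return total_for / (total_for + total_against)


# 40 possessions each (an invented number), averaging 5s versus 50s.
share = time_share(40, 5.0, 50.0)
print(f"{share:.1%}")  # prints 9.1%
```

The possession count cancels out of the ratio, so with equal starts the time share depends only on the average durations. That is exactly why the counts alone can't tell you who carried the play.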
Here are the total possession starts and ends for the Oilers in the game. This isn’t terribly interesting to me. The Oilers had their toughest time in the first period, came back strongly in the 2nd, and then didn’t really accomplish a whole lot in the 3rd, blah blah etc
Here is where the starts and ends happened by zone from the Oilers’ perspective. Only 13% of their possession starts happened in the offensive zone, but almost 26% of their possession ends happened in their defensive zone. So, they seemed to be giving it away to the opposition like crazy in their own end, which is the worst place to give your opponent the puck, as you’ve already ceded the 170 or so feet of territory without the Sharks having to put in any effort. The best place to end a possession is deep in your opponent’s end, forcing them to start on their own 5 yard line. Giving them the puck in your own zone is like throwing an interception deep in your own end, with your opponent starting from your 10 yard line. They’re already in the red zone, now they just have to score. A useful goal would be to minimize the percentage of possession ends that happen in your own zone, and to maximize the percentage of possession starts that happen in your opponent’s zone.
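Those two percentages are easy to compute once each event is tagged with a zone. Below is a minimal Python sketch, assuming events are stored as simple (kind, zone) pairs; the function name, the "start"/"end" labels, the zone codes, and the sample data are all made up for illustration and are not the counts from this game.

```python
from collections import Counter


def zone_shares(events):
    """Return (share of starts in the offensive zone,
               share of ends in the defensive zone).

    events: iterable of (kind, zone) pairs, where kind is "start" or "end"
    and zone is "DZ", "NZ", or "OZ" from the tracked team's perspective.
    """
    counts = Counter(events)
    starts = sum(n for (kind, _), n in counts.items() if kind == "start")
    ends = sum(n for (kind, _), n in counts.items() if kind == "end")
    oz_start_share = counts[("start", "OZ")] / starts if starts else 0.0
    dz_end_share = counts[("end", "DZ")] / ends if ends else 0.0
    return oz_start_share, dz_end_share


# Invented sample: 10 starts (2 in the OZ) and 10 ends (1 in the DZ).
sample = (
    [("start", "DZ")] * 5 + [("start", "NZ")] * 3 + [("start", "OZ")] * 2
    + [("end", "DZ")] * 1 + [("end", "NZ")] * 3 + [("end", "OZ")] * 6
)
oz, dz = zone_shares(sample)
print(f"OZ starts: {oz:.0%}, DZ ends: {dz:.0%}")  # OZ starts: 20%, DZ ends: 10%
```

The same function works for a whole team, a line, or a single player, depending on which events you feed it.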
But let’s take this a bit further. Here’s the lineup:
So, the Oilers have loaded up their top line with their best players and started Yakupov on his less-favoured left wing on that stupefying 2nd line (that never seemed to work last year); in many respects this looks like a typical lineup from last year. Let’s have a look at how the forward lines did:
These numbers only take into account the starts and ends caused by each player on that line. This isn’t an “on-ice” statistic which also includes the defencemen they were playing with, because that would be super hard to do. But you can see that the Hall/RNH/Eberle trio among themselves forced 25% of their possession starts to occur in the offensive zone, 5% higher than any other line could manage. So the first line had the least distance to travel to target the net, increasing the likelihood that they would be able to generate shots, and as I’ve shown this summer, good players can sustain higher than average on-ice shooting percentages. In other words, they’re putting themselves in a much better position to score.
On the flipside, the first line also had most of their possessions end in the offensive zone. Again, this is the best place to give up possession, as a) the Sharks now have to travel the length of the rink to your net to generate a shot, and b) you’ll have more opportunities to stop them before they get there. The first line also has the lowest percentage of possession ends that occur in the defensive zone, so they’re coughing up the puck to the other team in their own zone less, which is quite the feat considering Taylor Hall spent over 75% of his time at even strength playing against either Joe Thornton’s or Joe Pavelski’s line. Also notice that the first line is the only one that creates more possession starts in the offensive zone than possession ends in the defensive zone.
So how do the individual players do? I’ve created the following graphic, grouping the players by position:
For this, I’ve taken out all face off data, so that we can fairly reflect on the differences between positions. Some things jump right out at you here: firstly, it seems centres and defencemen have more of the responsibility for creating possessions (especially centres). To the right I’ve indexed each player’s starts, ends, and net difference per 20 minutes just to normalize for icetime, and you can see that Starts/20 for the Centres are consistently high, while Ends/20 among the Wingers are very high, on average. So, centres get the puck, and wingers lose it.
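The per-20 indexing is just a rate: count times 20, divided by minutes of ice time. Here is a small Python sketch of it. The function names are hypothetical and the sample numbers are invented for the arithmetic, not taken from any player in this game.

```python
def per_20(count: float, toi_minutes: float) -> float:
    """Scale a raw count to a rate per 20 minutes of ice time."""
    if toi_minutes <= 0:
        raise ValueError("ice time must be positive")
    return count * 20.0 / toi_minutes


def net_per_20(starts: int, ends: int, toi_minutes: float) -> float:
    """Possession starts minus ends, per 20 minutes of ice time."""
    return per_20(starts - ends, toi_minutes)


# Invented example: 6 starts and 9 ends in 15 minutes of ice time.
print(per_20(6, 15.0))         # 8.0
print(per_20(9, 15.0))         # 12.0
print(net_per_20(6, 9, 15.0))  # -4.0
```

With only one game of ice time behind each rate, a single event can swing a player’s number a lot, so these are best read as rough comparisons.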
Another thing to look at is how veteran players tend to do better than younger players. On a net/20 basis, Horcoff, Belanger, Hemsky, Smyth, Smid, and Nick Schultz generally do better than younger counterparts. Like nomadic Mongol tribes in the study of ancient civilizations, Ryan Whitney is always the exception, having the worst Net/20 statistic here no matter how much of a veteran he purports to be. Especially noticeable in the shaky rookie column is Teemu Hartikainen, who seems to cough the puck up at a rate 50% higher than anyone else.
That’s fun, but let’s look at each Determinant of Possession for each player on a per 20 minute basis to see if we can flesh out more detail:
I’d encourage you to click through the graphic to enlarge it. Some notes:
- Hartikainen’s problems seem to originate from a propensity to give the puck away, at a rate almost twice that of any teammate. I’d say anyone critiquing Harski’s game last season would echo that sentiment: the puck just died on his stick too much not to be noticed, and it showed up in this game.
- Eberle was outbattled for the puck at a rate more than double his nearest teammate. In raw terms, he lost 5 battles, with all 5 occurring in the offensive zone.
- Whitney was paired with Potter, and together they had the highest rate of turnovers on bad passes.
- RNH had the highest raw number of takeaways, with 5, but was eclipsed on a per-20 basis by Ryan Smyth and Eric Belanger. However, 3 of RNH’s takeaways were in the offensive zone, the best place for them to occur, while Smyth’s and Belanger’s all occurred in either the defensive or neutral zones.
- It was a terrible game for face-offs, with only Gagner positively contributing to possession on the dot. Belanger was a train wreck here.
- Smyth and Paajarvi tended to turn the puck over a lot through ill-timed dump ins. Dump ins occurred a lot in the game, but many were successful, with the Oilers retaining possession and therefore not showing up here.
- This is one game. You can’t draw much from this other than interesting tidbits. To get a real sense of what’s happening, you’d have to do this over a large number of games.
And therein lies the rub. Any exploration of new data requires someone to collect it in the first place. Just like those brave souls that collected zone entries and exits last season, it requires a ton of effort for one single soul to do this kind of thing. This exercise may just be a one-off in the end, but I wanted to open your minds a bit about what could be possible if NHL data collection started with an empty page. What is really important to look at? This is but one of many fascinating things I’m sure the community could come up with. And, yes, there is value in a single team doing this, but as we’ve seen with Corsi data, it is so much more worthwhile when something is done league-wide to benchmark against. If you agree, send a letter to your local member of Congress, or at least buy me some Mini-Eggs.