© 2013 Michael Parkatti

Combining Shot Differentials with Shot Distances

Recently I wrote a piece on the relationship between shot distance and scoring.  The main takeaway from that article is seen in the following chart:

[Chart: observed probability of an unblocked shot becoming a goal, by distance from the net]

There’s a distinct non-linear relationship between how far a shot is taken from the net and how likely it is to go in. This would suggest that teams and players that get closer to the net to take shots (or have a defence which keeps shots to the outside) will be more successful. I stated in the previous article that I think incorporating shot distances into shot differentials is a natural and obvious enhancement to ratios like Corsi and Fenwick. But how is that best accomplished? I’ll propose a new methodology in this article.

First, a thought experiment. Imagine a player has the following game in terms of even-strength shots while he’s on the ice:

  • Shots for: goal
  • Shots against: miss, save, save

How did he do?  The newspaper writer might look at this and say, “hell, he was +1, that’s all that matters”.  The stats blogger might look at this game and say, “hold on there, his unblocked shot attempt ratio was +1/-3, which is a Fenwick percentage of 25%.  He got lucky, and over the long run, if he sustains a similar percentage he’s in big trouble”.  Both positions have a kernel of truth in them, but is there a way to extend this conversation?

What if I told you the distances of the previous shots were:

  • Shots for: 11 feet
  • Shots against: 66 feet, 35 feet, 51 feet

Well, it seems that the one shot he was on the ice for was a gold-plated chance, while the three shots against all came from further out. Are the three farther-away shots still better than the one close one? One way to settle this would be to look up the probability of each of those shots going in at the split second it leaves the stick, based on how far out it was taken. Regardless of whatever chaos needs to unfold for that shot to result in a goal or not, what would we expect the probability to be if you freeze-framed your TV right as the puck leaves the blade of the stick?

The previous exercise I’d done resulted in a table of observed probabilities of unblocked shots from certain distances going in. I could create a logarithmic function to model this relationship, but for now I’ll just use the pure observed probabilities seen through almost 3 years of shot data, or 219,881 even-strength, non-empty-net unblocked shots. Let’s translate the shot distances of our above player into expected goal probabilities:

  • Shots for: 15.2%
  • Shots against: 1.2%, 3.9%, 1.8%
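
Those percentages come straight out of that observed lookup table. For the curious, here’s a rough sketch of how such a lookup could be built from raw shot data; this isn’t my actual code, and the file name, column names, and 5-foot bucket width are just illustrative assumptions:

```python
import pandas as pd

# Hypothetical shot-level data: one row per even-strength, non-empty-net,
# unblocked shot, with "distance" in feet and "goal" flagged 1 or 0.
shots = pd.read_csv("es_unblocked_shots.csv")

# Bin distances into 5-foot buckets (an arbitrary width for this sketch)
# and take the observed scoring rate within each bucket.
shots["bucket"] = (shots["distance"] // 5).astype(int) * 5
goal_prob = shots.groupby("bucket")["goal"].mean()

def expected_goal_prob(distance_ft):
    """Observed probability that an unblocked shot from this distance goes in."""
    bucket = int(distance_ft // 5) * 5
    # Fall back to the longest-range bucket for distances beyond the table
    return goal_prob.get(bucket, goal_prob.iloc[-1])
```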

Now, let’s add up these expected goal probabilities to come up with his overall expected goals for and against:

  • Expected goals for: 0.152 = 0.152
  • Expected goals against: 0.012 + 0.039 + 0.018 = 0.069

Based on what happened in that game, we’d expect the player to be on the ice for about 0.15 goals for and 0.07 goals against — this means his one dangerous shot was more than twice as likely to result in a goal as all three of the shots taken against him combined.

We can express this as a percentage, just like Corsi or Fenwick: Expected goals for/(Expected goals for + Expected goals against) = 0.152/(0.152 + 0.069) = 68.8%
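
In code, the whole worked example is just a few lines. Here’s a minimal sketch, continuing with the hypothetical expected_goal_prob() lookup from the sketch above:

```python
# The worked example: one shot for, three shots against, distances in feet.
distances_for = [11]
distances_against = [66, 35, 51]

egf = sum(expected_goal_prob(d) for d in distances_for)       # ~0.152
ega = sum(expected_goal_prob(d) for d in distances_against)   # ~0.069

eg_pct = egf / (egf + ega)   # ~0.688, versus a Fenwick% of 1/(1+3) = 25%
print(f"EGF {egf:.3f}, EGA {ega:.3f}, EG% {eg_pct:.1%}")
```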

This player’s team would have scored about 69% of the expected goals that night while he was on the ice, even though his Fenwick% was only 25%. Now, you can probably picture this same calculation being applied to all of this player’s shots for and against while on the ice to come up with his expected goal differential for a season. I’ve gone ahead and done this for the Oilers’ current season so far, up to and including Oct 29’s game versus the Leafs:

[Table: Oilers skaters’ expected goals for, expected goals against, EG%, and Fenwick% at even strength, through Oct 29]
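
For anyone who wants to reproduce a table like this, the aggregation is nothing fancy. Here’s a sketch, assuming a hypothetical on-ice shot log with made-up column names and reusing the lookup from the sketch above:

```python
# Hypothetical on-ice shot log: one row per skater on the ice for each shot,
# with "player", "direction" ("for" or "against"), and "distance" columns.
onice = pd.read_csv("oilers_onice_shots.csv")
onice["xg"] = onice["distance"].apply(expected_goal_prob)

# Sum expected goals for and against by player, then compute EG%.
season = onice.pivot_table(index="player", columns="direction",
                           values="xg", aggfunc="sum").fillna(0.0)
season["EG%"] = season["for"] / (season["for"] + season["against"])
print(season.sort_values("EG%", ascending=False))
```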

Some of these players have tiny samples so far, but remember that these small numbers are built from the exact same number of events as their traditional Fenwick numbers, which are already in the hundreds for many of them. The numbers shown here are the goals you’d expect them to be on the ice for, based on their shot differentials and those shots’ locations. You can already see some patterns emerge when you compare the expected goal % (EG%) to traditional Fenwick%:

  • Players like Ryan Smyth, Ryan Jones, and Jesse Joensuu are “known” as players who take the puck to the net, and these numbers bear that out. Players of this type have value in terms of steering the play towards the net, through whatever means necessary (driving the net with the puck, getting rebounds, or creating enough space to receive a pass). They are involved in higher-percentage plays, resulting in an EG% that is higher than their Fenwick% would usually show.
  • The only guys whose EG% is below their Fenwick% are Gazdic and Ben Eager. I’d expect goons to do especially badly in this metric, as they get outshot and the shots they do generate probably come from quite far out.
  • A highly skilled player like Ales Hemsky can have a much higher EG% than his Fenwick%, as he may really be fulfilling the stereotype of choosing to take fewer shots in favour of getting quality looks at the net.

Now, you can also express these as rates per 20 minutes to get a sense of who the high- and low-event players are:

[Table: Oilers skaters’ expected goals for and against per 20 minutes at even strength]

Ryan Smyth is still the Golden God here, showing off his quality-shot-creating prowess, but we also see that he gives up the most expected goals against per 20 minutes, so we’ll call that the cost of goods sold in this case. The rare finds on this list are guys who have both high rates of creation and lower rates of allowance — players like Arcobello, Belov, and Hemsky have shown this ability with decent sample sizes.

You also see who tends to give up a high number of quality chances — 2nd on this list for expected goals against per 20 mins is Nail Yakupov, who by reputation and by eye seems to be having issues on the defensive side of the puck. Other players like Eberle up front, and Nick Schultz, Smid, and Ference (!) on the back end, also seem to be struggling here. Also notice the chasm between Belov’s low expected goals against and the next-highest defenceman. Love that Russian.
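
For reference, those per-20 rates are just the season totals scaled by each player’s even-strength ice time. A quick sketch, assuming a hypothetical table of minutes played and continuing from the season frame in the sketch above:

```python
# Hypothetical table of each player's even-strength time on ice, in minutes.
toi = pd.read_csv("oilers_es_toi.csv").set_index("player")["toi_minutes"]

# Expected goals for/against per 20 minutes of even-strength ice time.
season["EGF/20"] = season["for"] / toi * 20
season["EGA/20"] = season["against"] / toi * 20
```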

And all of this isn’t just for evaluating players. You can use the exact same concepts to rate teams as well. Now, I’m not technically inclined enough to replicate this for the entire league, but I can tell you what’s going on with the Oilers. Their Fenwick% as a team in all even-strength situations this year is 46.9%. Their total expected goals for is 26.4 vs 27.1 expected goals against, for an EG% of 26.4/(26.4 + 27.1) = 49.3%. They’ve actually scored 28 and given up 35 so far. This should tell you that their problems are related to their goaltending not reaching the league-average levels the expected probabilities are based on (i.e., their goaltending has sucked so far).

Remember, this is accounting for “quality” as seen in shot distance.  If the Oilers were giving up a ton of 10-bell chances and hanging their goalies out to dry, they’d have a higher number of expected goals against, and they just don’t.

There are other aspects of shot quality, such as whether a shot was a rebound, a one-timer, a shot after a lateral pass, a breakaway, etc., which are not accounted for here. But this is a simple way of accounting for the quality of all kinds of shots in a transparent way. I’d imagine you could rig up your own expected probabilities to try to account for some of these things. There are a lot of ways this analysis could grow further.

7 Comments

  1. Posted October 30, 2013 at 9:14 am

    Have you seen our Total Hockey Ratings? It does this and then adjusts for QoT and QoC, etc.

    http://www.statsportsconsulting.com/thor

  2. Pierce Cunneen
    Posted October 30, 2013 at 9:58 am

    Isn’t the problem with this type of analysis that the shot location data recorded by the NHL is so inaccurate and so horrendous?

    • Michael Parkatti
      Posted October 30, 2013 at 11:00 am

      I guess my counter to this would be that over hundreds of thousands of shots, those inaccuracies may even out. I look at the first chart on the page here, and think that NHL.com must be getting something right, as the combined distances result in a logical goal scoring function.

      • Posted October 30, 2013 at 1:17 pm

        The distributions don’t eventually even out. There are clear differences in shot distributions between rinks, even if you look at just shots faced as the road team. There are distributional adjustments that need to be made beyond making the means all equal. We took a first pass at these sorts of adjustments in the THoR paper.

  3. Tyler
    Posted October 30, 2013 at 10:30 am

    You can attenuate the problem Pierce raises by using road data.

    I fooled around with this stuff a decade ago and found that most teams converged towards a mean. Slight differences, as I recall, on the team level. I assume this is why you can safely ignore shot quality most of the time.

    We didn’t have individual player-stamped data at the time. I’d be interested to see if the differences matter that much. I suspect that they do, but then I also think that we kind of roughly take this into account already, by acknowledging that shooting talent differences do exist among players.

    • Michael Parkatti
      Posted October 30, 2013 at 11:08 am

      I guess what’s most interesting about this is it may help explain why players have sustained better on-ice SH% when they obviously aren’t talented shooters. Smytty carries the load a different way than Hemsky, getting close instead of being a deadly shooter, etc.

      Hopefully I’ll be testing this to see if it’s a sustainable thing at a player level. I’d suspect that it is at the player level, but I have no idea ATM.

  4. G Money
    Posted October 31, 2013 at 1:04 am

    Great stuff. A couple of thoughts:

    - My company does similar statistical modelling of “ground truth” (microseismic monitoring of hydraulic fracturing), building models out of large samples of observed data. Rather than build a smoothed function (which can be *very* difficult), we just smooth the data (simple box filter). It actually works better than unsmoothed data.

    - Presumably you could use exactly this analysis to come up with a probabilistic save percentage for goaltenders. One argument for Dubnyk last year was that his save percentage was artificially lowered because of the lousy defense in front of him. This would help to either confirm or disprove that.
