In a piece earlier this summer, I wrote about whether players can sustain on-ice shooting percentages from year to year. Follow that link to get the entire picture, but the short answer was yes, the data suggest that players can and do sustain certain levels of on-ice shooting percentage, suggesting it is a ‘skill’ that can be replicated and not simply luck.

The implications of that analysis are pretty obvious — in addition to shot production, a player’s on-ice shooting percentage should be considered a key tool in deciding their offensive value. A player who has shown an ability to maintain a high on-ice shooting percentage should reasonably be expected to continue that attribute, and vice versa.

As I was sifting through the data for that post, however, I started to suspect that this logic may not apply symmetrically across hockey positions, especially between forwards and defencemen. One look at this chart should tell you that there is a lack of defencemen at the extremes of the league in terms of on-ice shooting percentage. I’m getting sick of typing that out, so let’s use OISP, k? It even sounds like a fun word!

So, I decided to explore this phenomenon to answer the question: who actually sustains OISP, forwards or defencemen? I’ll go through a series of tests of randomness (my favourite!) on two datasets — 82 defencemen who have played at least 30 games in 6 straight seasons, and 149 forwards who’ve done the same thing. This way we can try to isolate where the sustainability in OISP is coming from.

**OISP Leaders by Position**

Here we contrast the 10 players with the highest 6-year average OISP and the 10 players with the lowest, by position. At first glance, you can start to see the divergences. The top defenders have OISPs about a point lower than the top forwards, while the bottom 10 defenders are about a point above the bottom forwards. The range separating the top and bottom forwards is almost 4 percentage points; for the defencemen it’s about 1.5.

This means the forwards fly higher than the defencemen, but they also dip lower in terms of OISP.

**Normal Cumulative Distribution Chart**

This technique transforms a player’s OISP for any given year into a normal cumulative distribution score between 0 and 1. The chart above averages the 6 seasons of scores that I have data for. A mark of 0.5 means you’re right at the average of the overall player dataset for OISP. If OISP were a random metric, a player’s average score should tend towards 0.5 over the long term, just as flipping a coin should result in 50% heads over the long term. You may have up or down seasons, but after 6 entire NHL seasons you should be trending towards the long-term average if this were really random.
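For anyone who wants to try this at home, here is a minimal sketch of the scoring step. It assumes each season's scores are computed against that season's own sample mean and sample standard deviation, which is my guess at a reasonable setup rather than a confirmed detail of the spreadsheet behind the chart. The function names and the three made-up players are purely illustrative.

```python
from statistics import NormalDist, mean, stdev

def season_scores(oisp_by_player):
    """Convert one season of OISP values into normal CDF scores in (0, 1)."""
    values = list(oisp_by_player.values())
    # Assumption: fit the normal curve to this season's sample mean and stdev.
    dist = NormalDist(mu=mean(values), sigma=stdev(values))
    return {player: dist.cdf(v) for player, v in oisp_by_player.items()}

def average_scores(seasons):
    """Average each player's CDF score across all supplied seasons."""
    per_season = [season_scores(s) for s in seasons]
    players = per_season[0].keys()
    return {p: mean(s[p] for s in per_season) for p in players}

# Hypothetical OISP values (percent) for made-up players, not real data.
seasons = [
    {"A": 7.0, "B": 8.5, "C": 10.0},
    {"A": 9.0, "B": 8.0, "C": 10.0},
]
avg = average_scores(seasons)
for player, score in avg.items():
    print(player, round(score, 3))
```

A player who sits exactly at the season average scores 0.5 for that season, so a long-run average far from 0.5 is what hints at something other than luck.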

The difference between forwards and defencemen is readily apparent. The forwards’ chart is up and down around the central tendency of 0.5, some sustaining crazily high levels of OISP while others are very far down the axis. The defenders’ chart is much more bunched around 0.5, with outliers very rare and moderate in nature. This suggests that defencemen have a much harder time than forwards do maintaining a very high or very low level of OISP compared to their peers.

**Actual vs Expected Ratio of Above/Below Sample Average OISP Seasons**

This test treats the 6 seasons in my sample dataset as coin flips. There are buckets for the number of players who fit each criterion: 6 seasons with above average OISP / 0 seasons with below average, 5 above / 1 below, etc. If OISP were truly a random phenomenon, the actual number of players in each bucket should closely match the number expected by random chance, which I showed using a probability tree in an earlier post.
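The expected side of that comparison is just the binomial distribution. Here is a short sketch, assuming each season is an independent 50/50 flip between above and below average (strictly speaking that is only exact if "average" behaves like the median). The function name is my own.

```python
from math import comb

def expected_counts(n_players, n_seasons=6, p_above=0.5):
    """Expected number of players with k above-average seasons, k = 0..n_seasons."""
    return [
        n_players * comb(n_seasons, k) * p_above**k * (1 - p_above) ** (n_seasons - k)
        for k in range(n_seasons + 1)
    ]

# Sample sizes from the post: 149 forwards and 82 defencemen.
forwards = expected_counts(149)
defencemen = expected_counts(82)

for k, (f, d) in enumerate(zip(forwards, defencemen)):
    print(f"{k} above / {6 - k} below: forwards {f:.1f}, defencemen {d:.1f}")
```

Only 1 in 64 players should land in each of the 6/0 and 0/6 buckets by chance alone, which is why a crowd at the edges stands out.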

For forwards, we can see that there is a very strong tendency for players to have an extreme number of above or below average seasons (shown in blue), with a frequency much higher than you’d expect if OISP were random (shown in orange). For defencemen, the exact opposite is seen, with a stronger tendency towards the middle than the edges. If anything, I might guess that the defenders’ data is non-random the other way: the central tendency is so strong that it suggests a long-term pattern of non-extreme performance. That is, a defender could never lead a team in OISP, for instance, because he’s always being carried around offensively by the forwards on his team. Look at the top 10 lists above. Isn’t it weird that the top 3 defencemen all spent time in Pittsburgh with far and away the league leader in OISP (Crosby)? That two of the top 10 defenders played with two of the top 10 forwards in Washington? Forwards seem to be driving the OISP bus here.

We can also use chi-squared tests, which tell us the probability of observing a difference at least as large as the one between our observed and expected data if the actual data really did follow the expected (random) distribution. In other words, what’s the probability that our datasets would look this non-random if they actually were random? If that probability is less than 5%, we reject the hypothesis of randomness. For forwards, the p-value is 1.66896E-12, far below that threshold, so we can conclude that OISP in forwards fits a non-random distribution. The p-value for defencemen is 17.5%, meaning we cannot reject the hypothesis that OISP in defencemen is random.
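Here is a rough sketch of how such a test can be run with nothing but the standard library. Two loud caveats: the observed counts below are hypothetical placeholders (the real bucket counts live in the chart, not in this text), and I am assuming 7 buckets with 6 degrees of freedom, which may not match the exact setup behind the numbers quoted above. The closed-form tail probability used here only works for an even number of degrees of freedom.

```python
from math import comb, exp, factorial

def chi_square_sf(stat, df):
    """Upper-tail probability of a chi-squared variable (even df only)."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive, even df")
    half = stat / 2
    return exp(-half) * sum(half**i / factorial(i) for i in range(df // 2))

def goodness_of_fit(observed, expected):
    """Return the chi-squared statistic and its p-value."""
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, chi_square_sf(stat, len(observed) - 1)

n_forwards = 149
# Expected counts if every season were a fair coin flip (binomial, p = 0.5).
expected = [n_forwards * comb(6, k) / 64 for k in range(7)]
# HYPOTHETICAL observed counts for 0..6 above-average seasons, not real data.
observed = [10, 25, 30, 30, 27, 17, 10]

stat, p_value = goodness_of_fit(observed, expected)
print(f"chi-squared = {stat:.1f}, p = {p_value:.2e}")
```

One thing to keep in mind is that the edge buckets have expected counts below 5, where the chi-squared approximation gets shaky, so very small p-values are best read as "clearly not random" rather than taken digit for digit.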

**Regression of One Year to Predict the Next**

To put this to bed, let’s try to run a regression that uses one year’s OISP for a player to predict his next year’s OISP. If there is a statistically significant relationship between the two (i.e., if we can use one year to predict the next), we will have a strong indication that OISP is not random, and is, in fact, sustainable from year to year.

For forwards, we have an R-squared of 0.06, which doesn’t sound like a whole lot, but it comes with a P-value of 1.31508E-11 for the predicting-year variable, well below the 0.05 threshold needed to conclude that you can use one year’s OISP for a forward to predict his OISP for the next year.

For defencemen, the R-squared is 0.001 between one year’s data and the next. The P-value is 0.475, far above the 0.05 threshold needed to reject the hypothesis of randomness. Therefore, we find no statistically significant relationship in a defenceman’s OISP from year to year.
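If it seems odd that such a small R-squared can come with such a tiny P-value, the sample size is doing the work. The sketch below converts an R-squared and a number of year-over-year pairs into an approximate two-sided p-value for the slope. The assumptions are mine: I treat both 0.06 and 0.001 as R-squared values, I assume every consecutive-season pair was pooled (149 × 5 = 745 for forwards, 82 × 5 = 410 for defencemen), and I use a normal approximation to the t-distribution, which is only reasonable for large samples.

```python
from math import erfc, sqrt

def approx_slope_p_value(r_squared, n_pairs):
    """Approximate two-sided p-value for a simple regression slope.

    t = sqrt(R^2 * (n - 2) / (1 - R^2)); the tail area uses a normal
    approximation to the t-distribution (fine for large n only).
    """
    t_stat = sqrt(r_squared * (n_pairs - 2) / (1 - r_squared))
    p_value = erfc(t_stat / sqrt(2))  # equals 2 * (1 - Phi(t_stat))
    return t_stat, p_value

# Assumed pooled sample sizes: players times 5 consecutive-season pairs.
t_fwd, p_fwd = approx_slope_p_value(0.06, 149 * 5)
t_def, p_def = approx_slope_p_value(0.001, 82 * 5)

print(f"forwards:   t = {t_fwd:.2f}, p = {p_fwd:.2e}")
print(f"defencemen: t = {t_def:.2f}, p = {p_def:.2f}")
```

Under those assumptions the forwards land many orders of magnitude below 0.05 and the defencemen land well above it, the same pattern as the results quoted above. The rounded R-squared values mean the exact figures will not match.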

**Conclusions**

Well, this should be obvious by now. I contend that defencemen are innocent bystanders in the on-ice shooting percentage circus, showing an inability to consistently maintain high or low levels. Whether they show relatively high or low levels in any given year depends on the forwards they are playing with and the bounces they get. Forwards lead their defencemen’s OISP either up or down.

On the other hand, forwards show a definite ability to maintain high or low OISP in a statistically significant way. Indeed, I would consider it to be one of the major reflections of a forward’s true talent level. The Alex Tanguay player type of not generating a lot of shots but scoring a lot on the ones you get is just as valuable as the opposite type. The superstar players are the ones who, like Crosby, can do both.

And this does make sense. In any given game, a defender will play with a wide variety of player types from the first through the fourth lines and in a wide variety of playing situations. Simply hiring Shea Weber will not increase your team’s ability to score more goals at even strength on a given number of shots — hiring a crack OISP forward will.

What’s that ability worth? Over the past 6 years, our dataset of forwards has an average OISP of 8.61% (meaning goalies have a 0.914 SV% against them). Let’s look at the difference between a 7% OISP forward and a 10% OISP forward. Assume that the average top 6 forward plays 1000 minutes of 5-on-5 play in a full season (reasonably close), and generates about 10 shots on net per 20 minutes of play, which puts him on the ice for about 500 shots in the season. Player A has an OISP of 7%, so is on the ice for 35 goals. Player B has an OISP of 10%, so is on the ice for 50 goals. As a rule of thumb, 15 extra goals results in about 2.5 extra wins, or about 5 extra points in the standings. 5 points was the difference between 6th and 17th in the league, or potentially between winning your division or missing the playoffs.
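The back-of-the-envelope math above is easy to play with if you want to plug in your own numbers. This is just a sketch of the arithmetic in the paragraph. The 6 goals per win figure is the rule of thumb implied by "15 extra goals is about 2.5 wins", and the function and constant names are my own.

```python
def on_ice_goals(minutes, shots_per_20_min, oisp):
    """Goals scored while a player is on the ice over a season."""
    shots = minutes / 20 * shots_per_20_min
    return shots * oisp

MINUTES = 1000        # 5-on-5 minutes for a top 6 forward (from the post)
SHOTS_PER_20 = 10     # on-ice shots per 20 minutes (from the post)
GOALS_PER_WIN = 6.0   # assumption: implied by 15 goals being about 2.5 wins
POINTS_PER_WIN = 2

player_a = on_ice_goals(MINUTES, SHOTS_PER_20, 0.07)
player_b = on_ice_goals(MINUTES, SHOTS_PER_20, 0.10)

extra_goals = player_b - player_a
extra_wins = extra_goals / GOALS_PER_WIN
extra_points = extra_wins * POINTS_PER_WIN

print(f"extra goals: {extra_goals:.1f}")
print(f"extra wins: {extra_wins:.1f}")
print(f"extra points: {extra_points:.1f}")
```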

## 3 Comments

One other reason defencemen likely have a harder time driving OISP is that they take significantly fewer shots at even strength than forwards do. So even if a defenceman were in theory able to drive OISP given a high enough shot volume, that effect will simply be swamped in practice by the tendency for shots to be taken by forwards.

I want to disagree with your last paragraph a bit though. One of the reasons that some players maintain a high OISP is because of linemates: high SH% players tend to play with high SH% players and vice versa. I wouldn’t consider it “talent” to say that three guys who all shoot 11% happen to play together multiple seasons in a row. There is some ability of the truly elite playmakers to raise their linemates’ SH% (as Eric T. has shown) but that ability is not significant for the vast majority of NHL players; even Sidney Crosby only boosts goal production in this way by about 8 goals per season.

Interesting analysis – though I am curious as to how opportunity affects the outcome. Those forwards who maintain high OISP from year to year see lots of PP time, better scoring opportunities. I couldn’t tell if your analysis focused solely on ES-OISP.

There is also a modest problem with your regression analysis; the player is the sample unit, not the player-year. So including each inter-year comparison for each player in one analysis inflates your sample size. It’s not surprising, then, that you have a significant r^2 but a very modest correlation. I think it would be better to conduct each year to year regression separately, then ask if you get a significant year-to-year correlation more often than you would expect. That said, it seems clear that things are a bit more “random” for defencemen, but this is probably down to the fact they are taking lower probability shots in the first place.

I’ve always wondered why Corsi doesn’t weight things in its calculation (i.e. goal>shot>blocked shot>missed shot).

One thing I think is missing from this analysis is standard deviations. This isn’t a problem with just you, but with the way advanced stats folks in general present their work, and it’s a really, really bad habit.

For example, what if the ± on 153 Offensive players seasons (my rough count) is ±4% both in random chance and actual. Then the data doesn’t say much at all, because the range of the delta even just the top end (6A/0B, 5A/1B) is somewhere between -.6% and ~16%. The ramifications of the edges of those ranges are huge. But if the St. Dev is only, say 1/2%, then the analysis you’ve done is much more reliable.

153 or so is a decent sample size, but it’s not a huge sample size, there’s significant uncertainty in there. I think this analysis would benefit from showing it.