© 2013 Michael Parkatti lie.w.stats

The Statistical Relationship Between Hits, Shots, and Points

With the recent acquisition of Mike Brown, the Oilers seem to be falling prey to the mantra that you need to outhit your opponents in order to succeed in the NHL.  Many, many media members have asserted and reasserted this concept that, in short, hits are a proxy for toughness, and toughness is a proxy for male essence, and male essence leads to wins.

But what of it?  Is there any relationship between hits and shots for, or hits and shots against, or hits and standings points?

To answer this question, I gathered a data set of hits, total shots for, total shots against, and standings points for every team in each of the last 5 complete seasons.  Then I ran a series of one-on-one (single-variable) regressions to test whether total hits had any effect on shots or points.  Regression is simply a tool that tells us how much of the variance in one variable can be explained by the variance in another, and whether that relationship is ‘statistically significant’.  By convention, statisticians require 95% confidence in a relationship before saying that it does, in fact, exist.  Note that this doesn’t mean 95% of the variance is explained by the variable; it’s just a reflection of confidence.
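For readers who want to see what a one-on-one regression does mechanically, here is a minimal sketch of an ordinary least-squares line fit.  The function name `fit_line` and the numbers are made up for illustration; they are not the data set or the software used for this post.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and of x-y co-deviations.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical team-season rows (NOT the post's data): season hits and shots for.
hits = [1500, 1800, 2100, 2400, 2700]
shots_for = [2450, 2380, 2520, 2410, 2490]

intercept, slope = fit_line(hits, shots_for)
print(f"shots_for = {intercept:.1f} + {slope:.4f} * hits")
```

The slope is the number of extra shots associated with one extra hit; the fit alone says nothing yet about whether that slope is distinguishable from zero.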

We use a statistic called a ‘P-value’ to decide what the level of confidence is.  If a P-value is lower than 0.05, we can say we are more than 95% confident of statistical significance.  If it’s above 0.05, we can say that we are not confident in any relationship.  Here are the P-values of total hits against the three variables (shots for, shots against, and standings points), both for the last 5 years and for last year alone:
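To make the 0.05 rule concrete, here is a rough sketch that estimates a P-value with a permutation test.  This is a swapped-in technique, not how the P-values in this post were produced (those presumably come straight from standard regression output); the idea is the same, though: how often would a relationship at least this strong show up by pure chance?  The data, `n_perm`, and `seed` are all made up for illustration.

```python
import random
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Share of random re-pairings of x and y whose |r| is at least the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small tolerance so floating-point noise does not hide exact ties.
        if abs(pearson_r(x, shuffled)) >= observed - 1e-12:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_perm + 1)

# Hypothetical team-season rows (NOT the post's data).
hits = [1500, 1620, 1750, 1810, 1900, 2050, 2120, 2300, 2410, 2600]
points = [88, 97, 74, 102, 91, 83, 95, 79, 100, 86]

p = permutation_p_value(hits, points)
print(f"P-value = {p:.3f}; significant at 0.05: {p < 0.05}")
```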

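[Image hits1: table of P-values for total hits against shots for, shots against, and standings points, for the last 5 years and for last year]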

We can see that none of these are below 0.05, so we cannot say that total hits has a statistically significant effect on any of these variables.  The only one that’s even remotely close is the effect of hits on shots for, which came within 10% of the confidence level required to accept it.  It suggested that every 13.5 additional hits you make in a season result in one extra shot on net that season.  Either way, it still falls short of significance.  All of the other P-values are so high that there is no evidence of any relationship between them.

However, Tyler Dellow of mc79hockey has made great points on Twitter recently about how using total hits is flawed, because certain arenas overcount hits, making the methodology inconsistent between arenas.  The only way to control for this is to strip out all home hits and only count road hits, so that you have a varied subset of arenas in your sample without a dominant number of home hits.  So I re-ran my regressions only using road hits.  Here are the results:
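The road-only filter is simple to express in code.  A minimal sketch, assuming game-level records with a hypothetical `venue` field; the rows below are invented, not real game data.

```python
from collections import defaultdict

# Hypothetical game records (NOT real data): one row per team per game.
games = [
    {"team": "EDM", "venue": "home", "hits": 31},
    {"team": "EDM", "venue": "road", "hits": 18},
    {"team": "EDM", "venue": "road", "hits": 22},
    {"team": "CGY", "venue": "home", "hits": 27},
    {"team": "CGY", "venue": "road", "hits": 20},
]

def road_hits_by_team(rows):
    """Sum each team's hits, counting only games played away from home."""
    totals = defaultdict(int)
    for row in rows:
        if row["venue"] == "road":
            totals[row["team"]] += row["hits"]
    return dict(totals)

print(road_hits_by_team(games))
```

Dropping home games means no single scorekeeper's habits dominate any team's total, since road games are spread across many arenas.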

[Image hits2: table of P-values for road hits against shots for, shots against, and standings points]

Again, not one of the P-values is below 0.05; in fact, none of them is even close.  There is no evidence that the total number of road hits in a season has any effect on shots or standings points.

Also, not one R-squared value was above 0.04, while most were below 0.01, meaning that less than 1% of the variability in shots or points could be explained by the variability in hits.  The relationship is statistically random: some high-hit teams are good, some are bad, and vice versa.  There must be something else to explain why good teams are good.
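For anyone unsure what an R-squared of 0.01 measures, here is a small sketch that computes it from scratch as one minus the ratio of leftover variation to total variation.  The function name and the toy numbers are illustrative only, not the post's data.

```python
from statistics import mean

def r_squared(x, y):
    """Share of the variance in y explained by a straight-line fit on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Variation left over after the fit vs. total variation around the mean.
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - y_bar) ** 2 for b in y)
    return 1 - ss_res / ss_tot

# Toy example: y has no linear relationship with x at all.
print(r_squared([1, 2, 3, 4], [1, -1, -1, 1]))
```

A value near 0 means knowing x tells you almost nothing about y; a value near 1 means the line accounts for nearly all of the variation.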

Handily enough, I had the variable right in front of me in this dataset.  I took shots for and subtracted shots against, coming up with the overall shot differential for each team in each season for the last 5 years.  I then used that as the explanatory variable in a regression against standings points.  What were the results?

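[Image hits3: regression output for standings points against shot differential, with the P-value outlined in red]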

In short, the results are unequivocal: there is an incredibly strong relationship between shot differential and standings points.  I’ve outlined the P-value of this regression in red.  What does that even mean?  Well, you remember that you need a P-value lower than 0.05 to assert statistical significance, right?  The lower the number, the more confident you are in a statistical relationship.  The P-value in red is 0.000000000000015.  Or, I can say with 99.9999999999985% confidence that there is a statistically significant positive relationship between shot differential and standings points.  Compare this P-value to the ones in the two tables above, and you decide which one is more important to winning.
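The confidence figure is just arithmetic on the P-value.  A quick sketch, taking the reading used throughout this post (confidence = 1 minus the P-value) at face value; `Decimal` is used only because a number this small gets mangled by ordinary floating-point rounding.

```python
from decimal import Decimal

def confidence_percent(p_value):
    """The post's reading of a P-value: confidence = (1 - p) * 100."""
    return (Decimal(1) - Decimal(p_value)) * 100

# P-value passed as a string so no digits are lost before the subtraction.
print(confidence_percent("0.000000000000015"))
print(confidence_percent("0.05"))
```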

The regression equation is:

Standings Points = 91.68 + Shot Differential * 0.0275

This implies that a team with an even shot differential will end an 82-game season with 91.68 points.  Every 36.4 more shots that your differential increases by over a season provides you with one more standings point.  Conversely, every 36.4 shots your differential drops by in a season will cost you one point.  At the Oilers’ current shot differential rate this season of about -5.5 per game, over an 82-game season you’d expect them to have a shot differential of -454, meaning this equation would predict that they would finish an 82-game season with:

Standings Points = 91.68 + (-454 * 0.0275) = 79.2 standings points.

Right now, they’re on a 78.3-point pace if this were an 82-game season… so I’d say this regression equation holds pretty damned well.
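The arithmetic above can be checked with a few lines of code.  The two coefficients come straight from the regression equation in this post; the function and constant names are just labels chosen for this sketch.

```python
# Coefficients from the regression equation above.
INTERCEPT = 91.68   # predicted points for a team with an even shot differential
SLOPE = 0.0275      # standings points gained per extra shot of differential

def predicted_points(shot_differential):
    """Predicted standings points over an 82-game season."""
    return INTERCEPT + SLOPE * shot_differential

def shots_per_point():
    """How many shots of differential are worth one standings point."""
    return 1 / SLOPE

print(round(shots_per_point(), 1))        # shots of differential per point
print(round(predicted_points(-454), 1))   # projection at a -454 differential
```

Plugging in any other season-long differential gives the matching projection, for example `predicted_points(0)` for a team that breaks even on shots.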

Any questions?

3 Comments

  1. Shiny
    Posted March 7, 2013 at 10:24 am | #

    So are you saying that if the Oilers outshoot their opponents they’ll have a greater shot differential? Also, I know it was statistically dismissed, but are hits related at all to puck possession and shot differential? For example, Western Conf. vs. Eastern Conf., in which teams in the East hit less and have higher-scoring games than teams in the West, which has a literally heavier style of play and is typically lower scoring?

    Or am I completely over thinking things and missing the point altogether (which is not out of the realm of possibilities for me)?

    • Michael Parkatti
      Posted March 7, 2013 at 2:50 pm | #

      Basically, I’m saying that outshooting your opponents will lead to more wins and standings points, while more hits doesn’t correlate with points, or shots. Hits were not correlated with either shots for or shots against.

  2. Travis Towns
    Posted March 8, 2013 at 1:10 pm | #

    I would like to see hits vs takeaways. My immediate assumption is there must be a relation between these two stats.

