Edmonton Journal scribe Jim Matheson raised a few eyebrows tonight when he suggested Oiler prospect Teemu Hartikainen could make the team in order to fill a Holmstrom-type role, saying the Oilers are currently without such a player. Of course, many immediately pointed out that the team does, in fact, have someone playing the Holmstrom role, and that this person may actually do it better than Holmstrom: Ryan Smyth.
Those who’ve followed me on Twitter know that a recurrent theme on my timeline is random Ryan Smyth career facts. Like the fact that he’s within 6 power play goals of breaking Glenn Anderson’s team record, for example. He’s one of the few players I can find no fault in, along with Bill Ranford and Dr. Randy Gregg. He’s a perfect representation of a hockey player: the archetypal Canadian winger. He may not be endowed by the Creator with the most talent in the league, but as I remember telling a Flames fan friend once, “Smytty just scores goals”.
So this got me thinking: is Ryan Smyth better than Tomas Holmstrom? Is there a way to prove it? Why yes, you obsessive weirdo, I believe there is.
Certain dull people would simply look at Smyth’s career point totals vs Holmstrom’s (806 vs 530) and declare the debate over. Less dull people would look at their career points-per-game averages (0.70 vs 0.52) and reach the same conclusion. These are all marks we are comfortable with, so why not just leave it there?
What if we could show with a high degree of statistical confidence that Smytty’s career points per game (PPG) is higher not by random chance? That if both these players were to play hundreds of careers over again, Smytty would end up with a higher average points per game in all but maybe one? I’ll try to do just that. To begin, here is a graph of their points-per-game averages by season, counting only seasons with over 20 games played: 16 seasons for Smytty and 15 for Holmstrom:
You can see that they have fairly comparable career arcs, though Smytty certainly has periods of breaking away from Holmstrom. I decided to use a two-sample t-test, with unequal sample sizes (16 seasons vs 15) and equal variance. This is used to test whether two samples come from populations with statistically different means (averages). In English, I’m testing whether Smytty’s average seasonal PPG over his career is statistically significantly different (higher) than Holmstrom’s.
The null hypothesis is that the means of these two samples are not different, and that one player’s career seasonal average PPG cannot be called statistically higher than the other’s. The alternate hypothesis is that Smytty’s career seasonal average PPG is higher than Holmstrom’s.
First I checked whether I could use this type of test, which assumes the two populations have equal variances. I did a quick check of the variances and found no evidence that the variances of these two samples are unequal. Good.
Then I ran the t-test itself. Now, I chose a one-tailed t-test instead of a two-tailed t-test because all I’m trying to test for here is whether Smytty’s mean is higher than Holmstrom’s. A two-tailed test would simply check whether their two means are not equal, in either direction.
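For anyone who wants to follow along at home, here is a minimal Python sketch of how the statistic for an equal-variance (pooled) two-sample t-test is usually computed. It is not necessarily how the numbers in this post were produced; the function name `pooled_t_statistic` and the lists `sample_a` and `sample_b` are made up for illustration, and the values in them are placeholders, not either player's real seasonal PPG figures.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Return (t, df) for a two-sample t-test that assumes equal variances."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2  # degrees of freedom
    # Pooled variance: the two sample variances weighted by their own df.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    # Standard error of the difference between the two sample means.
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / std_err, df


# Placeholder values only (hypothetical), NOT real seasonal PPG data.
sample_a = [0.9, 0.7, 0.8, 0.6]
sample_b = [0.5, 0.6, 0.4]

t_stat, df = pooled_t_statistic(sample_a, sample_b)
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```

With 16 seasons for one player and 15 for the other, the same formula gives 16 + 15 - 2 = 29 degrees of freedom. For a one-tailed test, the t-stat is compared against the upper tail only.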
I got a t-stat of 2.84, when the critical t value needed to reject the null hypothesis was only 1.70. Therefore, I can reject the null hypothesis, and accept the alternate hypothesis that Smytty’s career seasonal average PPG marks have a mean that is statistically significantly higher than Holmstrom’s. I found a P-value of 0.004, meaning that I can accept the alternate hypothesis with a 99.6% level of confidence. Put another way, if these two players played 244 entire careers, chance would allow Holmstrom’s seasonal PPG average to be higher than Smytty’s only once.
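As a rough sanity check on those figures, the one-tailed p-value can be approximated from the t-stat and the degrees of freedom using nothing but the Python standard library. This is only a sketch: the helper names `t_pdf` and `upper_tail_p` are my own, the integration uses Simpson's rule rather than a stats package, and the 29 degrees of freedom assume the 16 and 15 seasons mentioned earlier.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail_p(t, df, steps=2000):
    """Approximate one-tailed p-value P(T >= t) for t >= 0.

    Integrates the density over [0, t] with Simpson's rule (steps must be
    even) and subtracts the result from 0.5, the area to the right of zero.
    """
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


df = 16 + 15 - 2                      # 29 degrees of freedom
p_value = upper_tail_p(2.84, df)      # p for the reported t-stat
p_at_crit = upper_tail_p(1.70, df)    # should sit near the 0.05 cutoff
print(f"p = {p_value:.4f}, about 1 in {1 / p_value:.0f}")
```

The p-value this produces should land close to the 0.004 quoted above, and the tail area at 1.70 should come out at about 0.05, which is where the critical value comes from.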
So, Mr. Matheson, or any other talking heads out there: which player would you rather have on your team — Ryan Smyth or Tomas Holmstrom?