© 2013 Michael Parkatti BOTB5-4

What’s the Frequency, Ales?

Every team in hockey has at least one player that can be considered their go-to player — the guy you put on the ice at the start of a power-play, the guy you put on the ice when you’re down a goal late, the guy who every person in the rink knows will be selected for the shootout. Even when you think back to pre-novice days, there is always that player on a team. To me, the definition of this player is Doug Weight in the late 90’s/early-aughts Oilers. Watching games, you just knew Dougie was going to be sent over the boards in a key situation, and you loved him for it. It’s odd to think that for the large majority of seasons since Weight was traded in 2001, the player who’s filled that role on the Oilers is Ales Hemsky.

Like most red-blooded Oiler fans, I think Hemsky has been perennially underrated in his NHL career. It’s pretty hard to think of players who are stylistic matches — I ran some numbers a couple of years ago that suggested Ales has the highest assist to goal ratio of any winger in NHL history with over 100 pts, with something like 2.5 assists for every goal. But the guy can make goalies look absolutely foolish with his shot, when he actually chooses to use it — he’s a lot like Alex Tanguay in that regard.

For whatever reason (injuries), Ales has earned a less than ideal reputation around the league, which is curious for a guy who hovered around one point per game for six straight seasons. He should be a star in this league, and his highlight plays should be on year-end countdowns, so what gives? Well, he’s been eclipsed by a trio of Canadian wunderkids, and it’s hard to see daylight around the shadow they cast. It’s well known that Ales still plays the tough competition, more so than the fresh rookies. It’s kind of a raw deal when you think about it: these hyped superteens are given cherry PP time, easy minutes at even strength, and are matched away from other teams’ best defensemen, while Ales gets all the tough minutes and has to carry a line while doing it.

What makes Ales Hemsky’s 2013 season very hard to predict is that his points per game (PPG) mark fell through the floor last year, dropping from 0.89 to 0.52, or 42% lower. A lot of this is due to role: he switched from being the unquestioned first liner in 10-11 to being second fiddle in 11-12. His time on ice (17:36) dropped to its lowest level since 06-07, though that’s still respectable for a guy who doesn’t kill penalties. This year the early reports are that he’ll be paired with Gagner and Yakupov. Gagner has shown good chemistry with Hemsky in the past, but Yakupov’s inclusion means this line could be fed some easier minutes than Hemsky is typically used to. So how can we approach this prediction?

First, I decided to run a regression model. I gathered a dataset of all wingers since 1994 who had played more than 200 games by the end of their 27-year-old season, and found their career PPG up to that point. I only included guys with a career average of at least 0.5 PPG up to that point in my sample. Then I needed to find out which of these players also played more than 20 games in both their 28- and 29-year-old seasons. I ended up with 57 observations that fit these criteria. What I wanted to do was predict the 29-year-old PPG using two inputs: the career PPG through age 27, and the 28-year-old PPG on its own as a trend identifier. Hemsky is entering his 29-year-old season, and had an erratic 28-year-old season relative to his past, which is why I structured it this way. When I ran the regression, I got the following equation:

29 yo PPG = upto27yo PPG * 0.353 + 28yo PPG * 0.643

For Hemsky, this translates into:

29 yo PPG = 0.81 * 0.353 + 0.52 * 0.643 = 0.62
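For anyone who wants to plug in a different player, the arithmetic above can be sketched in a few lines of Python. The coefficients are the ones from the equation; the function and argument names are just illustrative, not anything from the original spreadsheet.

```python
# Coefficients taken from the regression equation above.
COEF_CAREER_TO_27 = 0.353
COEF_AGE_28 = 0.643


def regression_forecast(career_ppg_to_27, ppg_at_28):
    """Forecast 29-year-old PPG from career PPG through age 27 and age-28 PPG."""
    return COEF_CAREER_TO_27 * career_ppg_to_27 + COEF_AGE_28 * ppg_at_28


# Hemsky's inputs as quoted in the post: 0.81 career PPG, 0.52 at age 28.
hemsky = regression_forecast(0.81, 0.52)
print(round(hemsky, 2))  # 0.62
```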

I wasn’t entirely satisfied with this approach, so I tried another route. What if I looked for the population of wingers who had a drop of more than 30% in PPG in any season between the ages of 25 and 30 (the caveat being that their ‘higher’ scoring year needed to be at least 0.5 PPG, to concentrate on ‘scorers’)? Then I’d see how they responded the year after the drop, to see if the downward trend continued. I found 63 such seasons. I tried to run a regression model on this data, but it came out pretty terribly. The analysis I wanted to share instead is the following: a histogram of how this sample of 63 players did the year after an ‘off’ year (a season with a PPG drop of more than 30%):



You can see that 38 of the 63 players experienced a positive bounceback from their ‘off’ year, while 25 regressed even further. 27 of the 63 had years that were within +/- 20% of their ‘off’ season, suggesting it was the new normal for them. 24 players (38%) had bounceback years of 40% growth from their ‘off’ season. The results are really all over the map, and are likely contextual for each player. Was the down year caused by a permanent shift in role, such as a drop from a scoring line to the bottom 6 forwards? Maybe they got Radek Dvorak’ed into becoming a defensive player (yes, he’s in this 63 player sample!). Or was it simply a blip on the path to bigger and better things (such as with Daniel Alfredsson)? I’d suggest that Ales will likely be a candidate to recover a lot of his offense, and sit on the right of this histogram: he’ll play easier minutes with a scoring line and be the feature player on the 2nd PP behind the kids. Any shakeup from the initial line combos will likely see him spend time with Taylor Hall, undoubtedly the bus driver of the forward corps. He’s also had injuries to muddle up the picture, and this long lockout has likely benefited his health more than it has for most players.
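The bucketing behind those counts can be sketched roughly as follows. The (off-year PPG, next-year PPG) pairs below are made up purely for illustration and are not players from the 63-season sample, and reading ‘40% growth’ as ‘40% or more’ is my assumption about how the bins were drawn.

```python
def pct_change(off_year_ppg, next_year_ppg):
    """Fractional change in PPG from the 'off' year to the following year."""
    return (next_year_ppg - off_year_ppg) / off_year_ppg


def summarize(pairs):
    """Count how many seasons fall into each of the categories discussed above."""
    changes = [pct_change(off, nxt) for off, nxt in pairs]
    return {
        "improved": sum(c > 0 for c in changes),
        "declined": sum(c < 0 for c in changes),
        "within_20pct": sum(abs(c) <= 0.20 for c in changes),
        "up_40pct_or_more": sum(c >= 0.40 for c in changes),  # assumed threshold
    }


# Hypothetical (off-year PPG, following-year PPG) pairs, NOT real players.
sample = [(0.50, 0.75), (0.40, 0.36), (0.60, 0.66), (0.45, 0.30), (0.50, 0.80)]
print(summarize(sample))
```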

To finish, I decided to change tack. I came up with a sample of all wingers who played at least 20 games in each of their 25-, 26-, 27-, 28-, and 29-year-old seasons since 1994, finding 97 such players. Then instead of doing a formal regression, I set up my own programming model that looked a lot like a weighted average model, except I wanted to find the weightings that minimized errors. I set up a 29-year-old forecast value that depended on weightings putting differing emphasis on the 25-, 26-, 27-, and 28-year-old seasons. Then I used Excel’s Solver add-in to tell me what combination of weights for the 25-28 year old season PPGs results in the minimum sum of squared errors between the actual 29-year-old season and the value forecasted by my equation. Essentially, I’m creating a weighted moving average using past years’ data, and asking Excel to tell me how important each of the preceding seasons was in how the players did in their 29-year-old season. Here’s what it told me:



Solver is telling me that the 25 year old season is useless in predicting the 29 year old season, so throw it away. In order of importance, it’s saying that the 26 year old season is tops with 43.7% of my weighting, 27 year old season is second with 37.1%, and the immediately preceding 28 year old season is actually least important, with 19.1% of the weighting. The equation looks a bit like a regression equation, as follows:

29yo PPG = 26yo PPG * 0.437 + 27yo PPG * 0.371 + 28 yo PPG * 0.191

For Hemsky, the calculation would be:

29yo PPG = 1.0 * 0.437 + 0.89 * 0.371 + 0.52 * 0.191 = 0.87 PPG
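The same calculation as a small Python sketch, using the Solver weights quoted above. The dictionary layout and function name are only illustrative.

```python
# Solver weights from the equation above, keyed by age of the season.
WEIGHTS = {26: 0.437, 27: 0.371, 28: 0.191}


def weighted_forecast(ppg_by_age):
    """Forecast 29-year-old PPG as a weighted sum of the age 26-28 seasons."""
    return sum(WEIGHTS[age] * ppg_by_age[age] for age in WEIGHTS)


# Hemsky's PPG by age as used in the post.
hemsky_ppg = {26: 1.0, 27: 0.89, 28: 0.52}
print(round(weighted_forecast(hemsky_ppg), 2))  # 0.87
```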

For fun I ran the multiple regression to see how its weightings would compare to my ramshackle homemade weightings. The sum of squared errors with my weightings was 21% lower than with the multiple regression weightings (which were statistically significant themselves, and much different from what you see above). The R-squared of my model is 0.723, which is pretty damned high. Let’s go with it.
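If you don’t have Excel handy, the idea behind the Solver step can be sketched in plain Python. This is a crude grid search rather than Solver’s actual optimizer, and it assumes the weights are non-negative and sum to 1, which the published weights are consistent with but which may not be the exact constraints used. The player rows are hypothetical numbers built from known weights so you can see the search recover them; they are not the 97-player sample.

```python
from itertools import product


def forecast(weights, seasons):
    """Weighted sum of the age 25, 26, 27 and 28 PPG values."""
    return sum(w * ppg for w, ppg in zip(weights, seasons))


def sse(weights, rows):
    """Sum of squared errors between forecast and actual age-29 PPG."""
    return sum((forecast(weights, past) - actual) ** 2 for past, actual in rows)


def grid_search(rows, step=0.05):
    """Try every non-negative weight combo (in `step` increments) summing to 1."""
    n = round(1 / step)
    best_w, best_err = None, float("inf")
    for a, b, c in product(range(n + 1), repeat=3):
        d = n - a - b - c  # the last weight is whatever is left over
        if d < 0:
            continue
        w = (a / n, b / n, c / n, d / n)
        err = sse(w, rows)
        if err < best_err:
            best_w, best_err = w, err
    return best_w, best_err


# Hypothetical PPG lines for ages 25-28; the age-29 'actuals' are generated
# from TRUE_W so the search has a known answer to find.
TRUE_W = (0.0, 0.45, 0.35, 0.20)
past_seasons = [
    (0.60, 0.70, 0.65, 0.50),
    (0.90, 0.80, 1.00, 0.75),
    (0.55, 0.95, 0.60, 0.85),
    (1.10, 0.65, 0.80, 0.90),
    (0.70, 1.05, 0.90, 0.60),
]
rows = [(past, forecast(TRUE_W, past)) for past in past_seasons]

best_weights, best_error = grid_search(rows)
print(best_weights, best_error)
```

With real data the minimum won’t be zero, and a finer step (or a proper optimizer) would be needed to get three-decimal weights like the ones above.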

Over an 82-game season, 0.87 PPG would end up being 71 points. Over a 48-game season, this would result in 42 points. I sort of doubt Hemsky plays the entire year, but this model is suggesting that his PPG mark could recover significantly. I could see this happening. But this forecast really is fraught with difficulty, and relies on qualitative observations as much as the quantitative ones, I suspect. I think the Oilers would need a pretty sizeable jump in team offense to be able to sustain 0.87 points per game for Hemsky. But it is possible.
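The points projection is just PPG times games played, rounded to the nearest point. A quick sketch (the function name is illustrative) makes it easy to try other games-played assumptions, say if he misses time:

```python
def projected_points(ppg, games):
    """Project season points from a PPG rate, rounded to the nearest point."""
    return round(ppg * games)


print(projected_points(0.87, 82))  # 71
print(projected_points(0.87, 48))  # 42
```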

Here’s the full 97-player sample I used for the last calculation:

