I’m one of those people whose major area of study in university is basically completely unknown by the general public. Whenever it comes up in conversation, I need to resort to describing things that it’s like, tools that it uses, topics it includes, or problems it can solve. I actually didn’t even know it existed until my second year of university. But every once and awhile, I get to pull it out of the toolbox and shove it in everyone’s face. Figuring out how to optimize the NHL’s impending divisional realignment seemed like a perfect application.

I’m a (former) student of Operations Research, or Operational Management in Canada, or sometimes known as Industrial Engineering in the United States. It uses mathematics, statistics, simulation, and a whole lot of Excel enslavement to solve problems — and by that I mean all kinds of problems. It can tell you how many tellers you need open at the bank to minimize wait times, it can tell you where to locate a business to maximize profits, or it can tell you where to drill for oil to maximize your expected yield. I tell people that it’s the science of ‘better’ — how to do something you intend to do a little bit better.

It came to prominence as an area of study during World War II. The British employed over a thousand mathematicians to tell them where to locate radar stations, where to locate airfields and how many airplanes each should have, and how to arrange their bomber formations to maximize damage over Nazi Germany. Many historians credit these reasoned processes with winning the Battle of Britain and turning the tide of the war. While not quite so important to the history of world, I’ll use similar concepts here to figure out how best to arrange 4 NHL divisions among its 30 teams.

The NHL is very close to agreeing on a divisional alignment, as follows:

I’m sure this is a reasonable suggested league alignment, but is it the best? How do you define what the ‘best’ divisional alignment is? I’m sure there are arguments that would be concerned with maximizing TV viewership, maximizing ticket sales, or simply keeping traditional rivalries alive. While all of these have merit, I’m going to concentrate on the alignment that minimizes the distance between each NHL team and every other team in its division. I’ll stick to the NHL’s proposed 4 division format, and allow each division to have either 7 or 8 teams. To do all of this analysis, I used Microsoft Excel. If you’re not interested in the methodology, just skip to the next graphic down the page…..

To begin, I needed to figure out the distances between each NHL city. I found a website where I could identify the geographical x,y coordinate (latitude and longitude) for each NHL city. I stuck to finding the centre of the city that each NHL team actually plays in, as opposed to finding the location of the airport or of the arena itself. Once I had the 30 coordinates, I created a 30 by 30 matrix that calculated the distance between each NHL city by using good ol’ high school math (you remember your distance of a line formula, right?).

Then I set up a 30 by 4 decision matrix, which listed each NHL city and the 4 theoretical divisions. I set up a Solver model whose task was to locate the 30 teams inside the 4 divisions. The constraints were that each of the 30 teams needed to be allocated to a division, and each division needed between 7 and 8 teams. Solver was restricted to decide the model using binary numbers, meaning either 1 or 0. 1 means it has allocated the team to that division, 0 meaning that it hasn’t.

This decision matrix fed a calculation matrix, which used my previous distance table to calculate what the distance would be between each proposed team in a division and every other proposed team in its division. The logic here is that minimizing divisional travel times is of primary importance in deciding league alignment. Therefore, my objective function was to minimize the sum of the distances between all proposed teams in every division.

This type of model is referred to as a discrete mathematical programming model because of the binary 1 and 0 nature of the decision being made here. To optimize it, Excel solver uses an algorithm technique known as branch and bound. Imagine a decision tree with literally millions of branches. This algorithm will explore the tree, and when it finds an area of the tree that isn’t close to optimizing the problem, it cuts it off and concentrates on the promising parts of the tree. It does this thousands upon thousands of times until it comes up with what it believes is the global optimum. Sometimes, it can go off track and explore an ‘inoptimal’ area of the tree by accident. To get around this problem, I set Solver to start analyzing the tree in multiple different areas, like a plinko ball being dropped down different parts of the plinko board. If you drop the plinko ball enough times, it should naturally find the global optimum at some point.

But I want to get across how difficult a problem of this size is. It doesn’t sound like much, but there are literally billions and billions of possible league alignments — if my back of the envelope calculations are correct, there were approximately 5.26 x 10^28 possible permutations of my decision matrix, or about ten million times more than all the grains of sand on the earth. It quickly becomes clear why techniques like branch and bound are necessary — even for a modern computer to go through all the possibilities one by one would likely take longer than the universe has existed. The computational complexity involved is a concept that’s known as NP-hard.

I let my model start solving, and it took about 12 hours of computing time for it to come up with an answer. It came up with the following suggested divisional alignment:

You’ll notice that the two Western divisions are exactly the same as the league proposal — right down to the teams it chose and total number of 7 in each division. Obvsiouly, this tells me that the NHL has employed someone to do a similar kind of analysis for them. The real deviation is in the Eastern divisions, where quite a bit of horse trading has gone on here.

Compare my new proposed league map:

With the league proposed map (courtesy of Bruce McCurdy from the Edmonton Journal):

The western conferences are obviously the same, but look at the difference in the eastern ones. My map creates a true ‘Great Lakes/Southern Atlantic’ division, with rust belt teams like Detroit, Columbus, and Pittsburgh joining Great Lakes teams Toronto and Buffalo along with the far southeast US markets of Tampa, Florida, and Carolina. The league proposal creates a very awkward division that’s basically a crescent moon around the Atlantic teams.

My last proposed division is a true representation of the Atlantic teams, with Ottawa and Montreal joining the tightly packed group around the 3 New York area teams with Boston, Philadelphia, and Washington D.C.. The league’s proposed comparable division for some reason starts pulling teams like Columbus from the American mid-west and Carolina from the Mid-Atlantic.

So how much travel does my proposed division alignment save versus the league’s? Here’s the total distance you’d need to travel between each team in a division and every other team in a division under the two proposed alignments:

Here, division 4 is the central one with Chicago, division 3 is the Pacific one, division 2 is the tightly clustered Atlantic division, and division 1 is the middle-ish division centred around Detroit. You can see that there’s no change in the western divisions, obviously, but a lot of variation between the eastern two. The Atlantic division is tightly packed enough to not be as large a difference (mine is 17.9% lower), but it’s the controversial middle division centred around Detroit that sees the most improvement (20.7% lower). Overall, my schedule has distances that are 9.2% lower than the league’s proposed alignment.

Now, I’m sure there were considerations like the maintenance of rivalries for the league’s proposal. Keeping Toronto in with Ottawa and Montreal seems to be important, and keeping Pittsburgh in with Philadelphia was obviously a main factor for this alignment as well. But, it’s not like those teams would never play each other, and wouldn’t it just improve the anticipation between those teams if they didn’t play each other all the time?

I’d be happy to share my work if Gary Bettman or Kevin Westgarth are interested.

## 5 Comments

“Keeping Toronto in with Ottawa and Montreal seems to be important, and keeping Pittsburgh in with Philadelphia was obviously a main factor for this alignment as well.” Not exactly. I think you need to replace “seems to be important” with “is required”. It’s required by the owners and more importantly, by the fans. There was a time that the NHL tried to balance divisions and Toronto ened up in the West. 1997-98 Season Now that they are back where they belong it will be virtually impossible to split them. An interesting exercise would be to repeat this for the East with certain rivalries required (PIT-PHI, MTL-TOR, etc.)

That’s a good suggestion, and I’m actually working on it right now…

Two comments related to the objective function, one objective, one subjective.

First, not all degrees are created equal. One degree of latitude is always 60 nautical miles, but one degree of longitude varies from 60 nautical miles at the equator to basically nothing near the pole. A reasonable first-order correction for this problem at the latitudes we’re dealing with is to multiply the difference in longitude by the cosine of the average latitude, so instead of

distance = sqrt((lat_1 - lat_0)^2 + (lon_1 - lon_0)^2)

you’d use

distance = sqrt((lat_1 - lat_0)^2 + (cos((lat_1 - lat_0)/2)*(lon_1-lon_0))^2)

This correction would make similar-latitude teams in the area we care about approximately 30% closer to each other, which seems like it might be enough to change the results.

(Of course, if you’re already calculating great circle distances, or something, that’s way more accurate, and ignore my comment. Your “you remember the distance of a line formula” line and table of geographical degrees made me think you might not have, though.)

The second, and more subjective point, is that not all travel is created equal. Specifically, international travel takes extra time, and what teams really care about optimizing is travel time, not travel distance. It seems like a reasonable estimate might be that a border crossing adds an hour’s worth of hassle each way. If you figure average travel speeds are 500 km/h, then it might be reasonable to add 500 km (or 4.5 latitude-equivalent degrees) for cross-border distances. Or maybe my correction factor is way off, I only gave this a few minutes’ thought.

Fellow OR student here. If you want some real oomph in your solving power, maybe take a look at GLPK. It’s basically an open source AMPL, and you can configure it to use CPLEX for some real number crunching power.

I don’t think the NHL needed an analyst to figure out those western divisions. There are 14 teams west of the eastern timezone, and there’s a pretty obvious divisonal border when you look at it on a map. The eastern conference is a whole nother story though. Your analysis seemed to optimize the “Divison 2″ teams based on their relatively extreme proximity to each other, and then leave all the leftovers in the other division.

Any thoughts on doing this for the other pro leagues?

## 2 Trackbacks

[...] couple of days ago I wrote a post in which I used discrete mathematical programming to optimize the NHL’s divisio…. I didn’t impose any constraints to my model beyond allotting either 7 or 8 teams to each [...]

[...] again use the tenants of operations research, a discipline I laid out in more detail recently while optimizing the league’s divisional alignment. My objective is to create a lineup which maximizes the average player-with-player Corsi of each [...]