Original article was published on Artificial Intelligence on Medium
A Combinatorial Curiosity
A beginner’s guide to combinatorics, and using it to solve a rather tricky question.
Let’s make sure we understand what the question means! (If you already understand the notation, you can skip to the section called ‘solving’)
A useful bit of mathematical notation is the factorial notation; this is a number followed by an exclamation mark. The best way to understand it is an example. 5!=5*4*3*2*1 = 120. 3! = 3*2*1 = 6. 10! = 10*9*8*7*6*5*4*3*2*1. In general N! means multiply all the integers upto N together.
Factorials are useful when looking at the number of ways of rearranging things. Suppose you have 3 balls of different colour, and want to know the number of ways you can arrange them in a line. You might work it out by hand, and discover there are 6 ways of doing this. Alternatively, you might reason more abstractly, and argue that there are 3 choices for the first ball in your line, then 2 for the second, and only one for the last, giving 3*2*1, or 3 factorial. A tree diagram represents this nicely.
As can be seen, the 6 combinations are Red-Green-Blue, Red-Blue-Green, Green-Red-Blue, Green-Blue-Red, Blue-Red-Green, Blue-Green-Red. These 6 combinations are all generated by our tree, which visualises the logic from previously: three branches for our first choice, and then for each branch, two branches for the second choice, and then one branch for the third choice.
The number of ways we can arrange 10 differently coloured balls in a line would be 10!, and with N balls there would be N! ways of arranging them.
It’s important to remember that the factorial notation represents numbers, just in a convenient way for certain tasks. Like other numbers, they can be added, divided, multiplied, subtracted and so on.
‘N choose k’ asks the question, from N people, how many ways can you form a team with k people in it. For instance, 10 choose 3 asks how many ways can you form a team of 3 people given 10 people to choose from.
Our first pass at this might be to think as before: we have 10 choices for our first player, and 9 for our second, and eight for our third, then we stop, as we aren’t choosing any more players.
But this isn’t quite right — this tells us the number of ways we can order 3 people in a line given 10 people to choose from. However, this isn’t what a team is! It doesn’t matter in which order we pick the players*, what matters is the team we get at the end.
*Although players’ egos might be bruised if they are picked last, this is not a mathematical consideration.
So, we adjust our original figure, and divide it by 3!. (Recall 3!=3*2*1=6.)
That’s because, for every team of 3 people, there are 3! ways of arranging that team in a line. So the clever insight was to spot that our 10*9*8 includes each team 3! times. Therefore, if we divide by 3! we get the total number of distinct teams.
There’s a special notation we use for this idea of ‘n choose k’
Now we can get onto the problem at hand!
There’s two ways to look at this problem. The first way is simply as some algebra, which you could expand out and try to solve. There’s nothing wrong with this approach, but it’s not exactly elegant is it?
The second way is to view it as a combinatorial problem.
The right hand side of the equation asks, out of 2n people, how different teams of size n are possible?
But the left hand side of the equation does not have such an easy interpretation.
The first thing we do is note that ‘N choose k’ is equal to ‘N choose (N-k)’. E.g. the number of teams if size 4, chosen from a pool of 12 people, is the same as the number of teams of size 8 (8=12–4) from a pool of 12 people. That’s because, once you have picked your team of 8, the remaining 4 players are a team of 4. Equivalently, picking a team of 4 players out of 12 leaves 8 players remaining, which means by choosing a team of size 4, you have also created a team of size 8. So every unique way of picking a team of size 4 also created a unique way of picking a team of size 8, and vice versa.
We re-write the equation to reflect this. Our new goal is to prove the following:
If you are less comfortable with the n’s, k’s and N’s, and abstract notation, do not worry! We will first solve an actual example, which is conceptually the same as proving the general case.
Let’s consider the case where we are trying to pick 11 players for a soccer team from 22 possible players. That’s 22 Choose 11 (the right hand side of our equation). Now we shift our perspective. We split the 22 players into two groups, each of size 11. Perhaps two local clubs have merged, and we need to choose a new first team from both their previous first teams.
Every time we pick a team, we are choosing some players from each group. For instance, we might pick 6 players from the first group and 5 from the second to form our 11 person team. I call this a ‘Split’. A ‘Split’ is a way of dividing between the two teams how many players they get to contribute to the joint team.
Let’s consider a (6,5) split. The number of ways of picking 6 players from the first team is ’11 Choose 6′, and the number of ways of picking 5 players from the other team is ’11 Choose 5′. Multiply these together, and you get the number of ways of picking a team of 11 people where you pick 6 from the first team, and 5 from the second team.
We can at most pick 11 from one team, and 0 from the other, then 10 from one team and 1 player from the other and so on: (11,0), (10,1), (9,2), (8,3) (7,4), (6,5), (5,6), (4,7), (3,8), (2,9), (1,10), (0,11). For each ‘Split’, we get a number of possible teams we can build. Summing over all ‘Splits’ we get the total number of teams we can build.
The logic for the general case of with ‘2N Choose N’ uses identical logic, just with slightly more general language. We calculate the number of possible ways of building a team given a Split, and then sum over all the Splits. Each split is one of (0,N), (1,N-1), (2, N-2), (3, N-3),…(k,N-k),…(N-1,1), (N,0). And the number of ways of building a team given a (k, N-k) split is: ‘N choose k’ times ’N choose (N-k)’. Summing over all k gives us the left hand side of our identity.
The insight here is to note that there are multiple ways of splitting the same numbers into parts and summing over the parts. Combinatorics helped us identify that the left hand side of the equation represented a way of splitting the number of the right hand side into parts which summed to the whole.
It’s a very human approach; a computer is very brilliant at multiplying out and cancelling terms, but I think it is harder to formalise the human ability to conceptualise a number in multiple ways and prove things using these conceptualisations.