2007-10-17

## Generating Combinations: 1

Posted by scvalex

Wikipedia defines combinations as:

In combinatorial mathematics, a combination is an un-ordered collection of unique elements. (An ordered collection is called a permutation.) Given S, the set of all possible unique elements, a combination is a subset of the elements of S. The order of the elements in a combination is not important (two lists with the same elements in different orders are considered to be the same combination). Also, the elements cannot be repeated in a combination (every element appears uniquely once); this is often referred to as “without replacement/repetition”. This is because combinations are defined by the elements contained in them, so the set {1, 1, 1} is the same as {1}. For example, from a 52-card deck any 5 cards can form a valid combination (a hand). The order of the cards doesn’t matter and there can be no repetition of cards.

Mathworld provides a more terse definition:

The number of ways of picking k unordered outcomes from n possibilities.

The **combinations of n elements chosen as k** is the number of unique ways of selecting *k* elements from a set of *n*.

From now on, by set of *n* I always mean one of the form *{1, 2, 3, …, n}*.

So, what are the ways of choosing *2* elements from a set of *4*, *{1, 2, 3, 4}*?

{1, 2}

{1, 3}

{1, 4}

{2, 3}

{2, 4}

{3, 4}

That’s *6* ways, but what is the general formula?

$$\binom{n}{k} = \frac{n (n - 1) \cdots (n - k + 1)}{k!}$$

This is easily proved: for a set of *n*, there are *n* ways of choosing the first element, *n * (n - 1)* ways of choosing the first two elements, …, *n * (n - 1) * … * (n - k + 1)* ways of choosing the first *k* elements. Unfortunately, this counts every subset more than once: each subset of *k* elements shows up once for each of its *k!* permutations. So, we have to divide the number of ordered selections (*n * (n - 1) * … * (n - k + 1)*) by the number of repetitions (*k!*). This yields exactly the formula noted above.
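As a quick check of this argument, here is the count worked out for the earlier example, *n = 4* and *k = 2*. There are two factors on top because *k = 2*:

$$\underbrace{4 \cdot 3}_{\text{ordered selections}} = 12, \qquad \frac{12}{2!} = \frac{12}{2} = 6$$

Each pair, such as {1, 2}, is counted twice among the 12 ordered selections (once as 1 then 2, once as 2 then 1), so dividing by *2! = 2* leaves the six pairs listed earlier.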

Combinations are a remarkably widespread concept: they show up throughout mathematics and especially in the analysis of algorithms. That said, there’s only one thing you really **need** to know: how to apply the formula.

Look at the formula above and notice that there are exactly *k* factors in the numerator and *k* factors in the denominator. So, to remember the formula and easily apply it:

**P1**. Draw the fraction line.

**P2**. Above the line, write *k* terms of the form: *n*, *n - 1*, *n - 2*, …

**P3**. Below the line, write *k* terms of the form: *1*, *2*, *3*, …
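The three steps above translate almost directly into code. What follows is only a minimal sketch in C: the function name `choose` is a hypothetical helper made up for illustration, not something from comb1.c. Instead of building the whole numerator and denominator first, it pairs the *i*-th term above the line with the *i*-th term below the line, which keeps the intermediate values small.

```c
/* Hypothetical helper (not part of the original post): computes
   "n choose k" by pairing each term above the fraction line
   (n, n - 1, n - 2, ...) with a term below it (1, 2, 3, ...). */
unsigned long long choose(unsigned int n, unsigned int k)
{
    unsigned long long result = 1;
    unsigned int i;

    if (k > n)
        return 0; /* cannot pick more elements than the set contains */

    for (i = 0; i < k; ++i) {
        /* Before this step result holds "n choose i"; multiplying by
           (n - i) and dividing by (i + 1) gives "n choose i + 1",
           so the integer division never loses a remainder. */
        result = result * (n - i) / (i + 1);
    }

    return result;
}
```

For instance, `choose(4, 2)` evaluates to 6, the number of pairs listed earlier, and `choose(52, 5)` evaluates to 2598960, the number of 5-card hands from a 52-card deck. For large *n* the intermediate product can still overflow `unsigned long long`, so this is not a general-purpose routine.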

And now for the **fun part**. How do you generate combinations? Look closely at the example above. The first thing to note is that every combination is an array of *k* elements. Next, the first element of the combinations takes, in turn, every value between *1* and *n - k + 1*. What about the other elements? They’re always between *1* and *n* and they’re always in ascending order. Now it should be obvious what the algorithm is:

**P1**. Start off with *(1, 2, …, k)*; this is the first combination.

**P2**. Print it.

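**P3**. Given the combination *(c<sub>0</sub>, c<sub>1</sub>, …, c<sub>k-1</sub>)*, start from the back with *i = k - 1* and increment *c<sub>i</sub>*. While *c<sub>i</sub>* is larger than *n - k + 1 + i*, the largest value allowed at position *i*, move on to the previous index *i - 1* and increment that element instead. After this, if *c<sub>0</sub> > n - k + 1*, then this is not a valid combination, so we stop. Otherwise give *c<sub>i+1</sub>, c<sub>i+2</sub>, …* the values of *c<sub>i</sub> + 1, c<sub>i+1</sub> + 1, …*. Jump to **P2**. (The C code below stores every element minus one, so its bounds are one smaller.)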

Here’s the source code in C (comb1.c):


```c
#include <stdio.h>

/* Prints out a combination like {1, 2} */
void printc(int comb[], int k)
{
    int i;

    printf("{");
    for (i = 0; i < k; ++i)
        printf("%s%d", i ? ", " : "", comb[i] + 1);
    printf("}\n");
}

/*
    next_comb(int comb[], int k, int n)
        Generates the next combination of n elements as k after comb

    comb => the previous combination (use (0, 1, 2, ..., k - 1) for first)
    k => the size of the subsets to generate
    n => the size of the original set

    Returns: 1 if a valid combination was found
             0, otherwise
*/
int next_comb(int comb[], int k, int n)
{
    int i = k - 1;

    ++comb[i];
    /* Carry to the left while position i is past its maximum, n - k + i.
       Stop at i == 0 so that comb[-1] is never touched. */
    while ((i > 0) && (comb[i] >= n - k + 1 + i)) {
        --i;
        ++comb[i];
    }

    if (comb[0] > n - k) /* Combination (n-k, n-k+1, ..., n-1) was the last one */
        return 0; /* No more combinations can be generated */

    /* comb now looks like (..., x, n, n, n, ..., n).
       Turn it into (..., x, x + 1, x + 2, ...) */
    for (i = i + 1; i < k; ++i)
        comb[i] = comb[i - 1] + 1;

    return 1;
}

int main(void)
{
    int n = 5; /* The size of the set; for {1, 2, 3, 4} it's 4 */
    int k = 3; /* The size of the subsets; for {1, 2}, {1, 3}, ... it's 2 */
    int comb[16]; /* comb[i] is the index of the i-th element in the
                     combination; this requires 1 <= k <= n and k <= 16 */
    int i;

    /* Setup comb for the initial combination */
    for (i = 0; i < k; ++i)
        comb[i] = i;

    /* Print the first combination */
    printc(comb, k);

    /* Generate and print all the other combinations */
    while (next_comb(comb, k, n))
        printc(comb, k);

    return 0;
}
```
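Because `comb[i]` holds an index rather than a value, the same idea works for any set, not just *{1, 2, …, n}*. The sketch below illustrates that and is not part of comb1.c: `advance` and `print_subsets` are hypothetical names, and `advance` uses a slightly different but equivalent formulation (it looks for the rightmost index that can still grow, instead of incrementing first and carrying). It assumes *1 ≤ k ≤ n ≤ 16*.

```c
#include <stdio.h>

/* Hypothetical helper: advances idx[0..k-1], a strictly increasing list of
   indices into a set of n items, to the next combination.
   Returns 1 on success, 0 if idx already was the last combination. */
static int advance(int idx[], int k, int n)
{
    int i = k - 1;

    /* Find the rightmost index that is still below its maximum, n - k + i. */
    while (i >= 0 && idx[i] == n - k + i)
        --i;
    if (i < 0)
        return 0;

    ++idx[i];
    for (++i; i < k; ++i)
        idx[i] = idx[i - 1] + 1; /* reset everything to the right */

    return 1;
}

/* Prints every k-element subset of items[0..n-1], one per line, and
   returns how many were printed (0 if the arguments are out of range). */
int print_subsets(const char *items[], int n, int k)
{
    int idx[16];
    int i, count = 0;

    if (k < 1 || k > n || n > 16)
        return 0;

    for (i = 0; i < k; ++i)
        idx[i] = i; /* first combination: indices 0, 1, ..., k - 1 */

    do {
        for (i = 0; i < k; ++i)
            printf("%s%s", i ? ", " : "", items[idx[i]]);
        printf("\n");
        ++count;
    } while (advance(idx, k, n));

    return count;
}
```

Calling `print_subsets` with an array of four strings and `k = 2` prints six lines and returns 6, in the same order as the pairs listed at the top of the post.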

Always open to comments. Have fun.