# Twelvefold way

In combinatorics, the twelvefold way is a systematic classification of 12 related enumerative problems concerning two finite sets, which include the classical problems of counting permutations, combinations, multisets, and partitions either of a set or of a number.

The idea of the classification is credited to Gian-Carlo Rota, and the name was suggested by Joel Spencer.

## Overview

The functions are subject to one of the three following restrictions:

There are four different equivalence relations which may be defined on the set of functions f from N to X:

- equality;
- equality up to a permutation of N;
- equality up to a permutation of X;
- equality up to permutations of N and X.

The three conditions on the functions and the four equivalence relations can be paired in 3 × 4 = 12 ways.

The twelve problems of counting equivalence classes of functions do not involve the same difficulties, and there is not one systematic method for solving them.

Two of the problems are trivial (the number of equivalence classes is 0 or 1), five problems have an answer in terms of a multiplicative formula of n and x, and the remaining five problems have an answer in terms of combinatorial functions (Stirling numbers and the partition function for a given number of parts).

The incorporation of classical enumeration problems into this setting is as follows.

- Counting n-permutations (i.e., partial permutations or sequences without repetition) of X is equivalent to counting injective functions N → X.
- Counting n-combinations of X is equivalent to counting injective functions N → X up to permutations of N.
- Counting permutations of the set X is equivalent to counting injective functions N → X when n = x, and also to counting surjective functions N → X when n = x.
- Counting multisets of size n (also known as n-combinations with repetitions) of elements in X is equivalent to counting all functions N → X up to permutations of N.
- Counting partitions of the set N into x subsets is equivalent to counting all surjective functions N → X up to permutations of X.
- Counting compositions of the number n into x parts is equivalent to counting all surjective functions N → X up to permutations of N.

## Viewpoints

The various problems in the twelvefold way may be considered from different points of view.

### Balls and boxes

Traditionally many of the problems in the twelvefold way have been formulated in terms of placing balls in boxes (or some similar visualization) instead of defining functions.

The set N can be identified with a set of balls, and X with a set of boxes; the function ƒ : N → X then describes a way to distribute the balls into the boxes, namely by putting each ball a into box ƒ(a).

Thus the property that a function ascribes a unique image to each value in its domain is reflected by the property that any ball can go into only one box (together with the requirement that no ball should remain outside of the boxes), whereas any box can accommodate (in principle) an arbitrary number of balls.

Requiring in addition ƒ to be injective means forbidding to put more than one ball in any one box, while requiring ƒ to be surjective means insisting that every box contain at least one ball.

Counting modulo permutations of N or X (or both) is reflected by calling the balls or the boxes, respectively, "indistinguishable".

This is an imprecise formulation (in practice individual balls and boxes can always be distinguished by their location, and one could not assign different balls to different boxes without distinguishing them), intended to indicate that different configurations are not to be counted separately if one can be transformed into the other by some interchange of balls or of boxes.

This possibility of transformation is formalized by the action by permutations.

### Sampling

Another way to think of some of the cases is in terms of sampling, in statistics.

Imagine a population of X items (or people), of which we choose N. Two different schemes are normally described, known as "sampling with replacement" and "sampling without replacement".

In the former case (sampling with replacement), once we've chosen an item, we put it back in the population, so that we might choose it again.

The result is that each choice is independent of all the other choices, and the set of samples is technically referred to as independent identically distributed.

In the latter case, however, once we've chosen an item, we put it aside so that we can't choose it again.

This means that the act of choosing an item has an effect on all the following choices (the particular item can't be seen again), so our choices are dependent on one another.

In the terminology below, the case of sampling with replacement is termed "Any f", while the case of sampling without replacement is termed "Injective f".

Each box indicates how many different sets of choices there are, in a particular sampling scheme.

The row labeled "Distinct" means that the ordering matters.

For example, if we have ten items, of which we choose two, then the choice (4,7) is different from (7,4).

On the other hand, the row labeled "Sn orders" means that ordering doesn't matter: Choice (4,7) and (7,4) are equivalent.

(Another way to think of this is to sort each choice by the item number, and throw out any duplicates that result.)

In terms of probability distributions, sampling with replacement where ordering matters is comparable to describing the joint distribution of N separate random variables, each with an X-fold categorical distribution.

The case where ordering doesn't matter, however, is comparable to describing a single multinomial distribution of N draws from an X-fold category, where only the number seen of each category matters.

The case where ordering doesn't matter and sampling is without replacement is comparable to a single multivariate hypergeometric distribution, and the fourth possibility does not seem to have a correspondence.

Note that in all the "injective" cases (i.e. sampling without replacement), the number of sets of choices is zero unless N ≤ X.

("Comparable" in the above cases means that each element of the sample space of the corresponding distribution corresponds to a separate set of choices, and hence the number in the appropriate box indicates the size of the sample space for the given distribution.)

From this perspective, the case labeled "Surjective f" is somewhat strange: Essentially, we keep sampling with replacement until we've chosen each item at least once.

Then, we count how many choices we've made, and if it's not equal to N, throw out the entire set and repeat.

This is vaguely comparable to the coupon collector's problem, where the process involves "collecting" (by sampling with replacement) a set of X coupons until each coupon has been seen at least once.

Note that in all "surjective" cases, the number of sets of choices is zero unless N ≥ X.

### Labelling, selection, grouping

A function ƒ : N → X can be considered from the perspective of X or of N. This leads to different views:

- the function ƒ labels each element of N by an element of X.
- the function ƒ selects (chooses) an element of the set X for each element of N, a total of n choices.
- the function ƒ groups the elements of N together that are mapped to the same element of X.

These points of view are not equally suited to all cases.

The labelling and selection points of view are not well compatible with permutation of the elements of X, since this changes the labels or the selection; on the other hand the grouping point of view does not give complete information about the configuration unless the elements of X may be freely permuted.

The labelling and selection points of view are more or less equivalent when N is not permuted, but when it is, the selection point of view is more suited.

The selection can then be viewed as an unordered selection: a single choice of a (multi-)set of n elements from X is made.

### Labelling and selection with or without repetition

When viewing ƒ as a labelling of the elements of N, the latter may be thought of as arranged in a sequence, and the labels as being successively assigned to them.

A requirement that ƒ be injective means that no label can be used a second time; the result is a sequence of labels without repetition.

In the absence of such a requirement, the terminology "sequences with repetition" is used, meaning that labels may be used more than once (although sequences that happen to be without repetition are also allowed).

For an unordered selection the same kind of distinction applies.

If ƒ must be injective, then the selection must involve n distinct elements of X, so it is a subset of X of size n, also called an n-combination.

Without the requirement, a same element of X may occur multiple times in the selection, and the result is a multiset of size n of elements from X, also called an n-multicombination or n-combination with repetition.

In these cases the requirement of a surjective ƒ means that every label is to be used at least once, respectively that every element of X be included in the selection at least once.

Such a requirement is less natural to handle mathematically, and indeed the former case is more easily viewed first as a grouping of elements of N, with in addition a labelling of the groups by the elements of X.

### Partitions of sets and numbers

When viewing ƒ as a grouping of the elements of N (which assumes one identifies under permutations of X), requiring ƒ to be surjective means the number of groups must be exactly x.

Without this requirement the number of groups can be at most x.

The requirement of injective ƒ means each element of N must be a group in itself, which leaves at most one valid grouping and therefore gives a rather uninteresting counting problem.

When in addition one identifies under permutations of N, this amounts to forgetting the groups themselves but retaining only their sizes.

These sizes moreover do not come in any definite order, while the same size may occur more than once; one may choose to arrange them into a weakly decreasing list of numbers, whose sum is the number n. This gives the combinatorial notion of a partition of the number n, into exactly x (for surjective ƒ) or at most x (for arbitrary ƒ) parts.

## Formulas

Formulas for the different cases of the twelvefold way are summarized in the following table; each table entry links to a subsection below explaining the formula.

The particular notations used are:

### Intuitive meaning of the rows and columns

This is a quick summary of what the different cases mean.

The cases are described in detail below.

Then the columns mean:

And the rows mean:

- Distinct: Leave the lists alone; count them directly.
- Sn orbits: Before counting, sort the lists by the item number of the items chosen, so that order doesn't matter, e.g. (5,2,10), (10,2,5), (2,10,5), etc. all → (2,5,10).
- Sx orbits: Before counting, renumber the items seen so that the first item seen has number 1, the second 2, etc. Numbers may repeat if an item was seen more than once, e.g. (3,5,3), (5,2,5), (4,9,4), etc. → (1,2,1) while (3,3,5), (5,5,3), (2,2,9), etc. → (1,1,2).
- Sn×Sx orbits: Before counting, sort the lists and then renumber them, as described above. (Note: Renumbering then sorting will produce different lists in some cases, but the number of possible lists does not change.)

### Details of the different cases

The cases below are ordered in such a way as to group those cases for which the arguments used in counting are related, which is not the ordering in the table given.

#### Functions from N to X

This case is equivalent to counting sequences of n elements of X with no restriction: a function f : N → X is determined by the n images of the elements of N, which can each be independently chosen among the elements of x.

This gives a total of x possibilities.

#### Injective functions from N to X

This case is equivalent to counting sequences of n distinct elements of X, also called n-permutations of X, or sequences without repetitions; again this sequence is formed by the n images of the elements of N. This case differs from the one of unrestricted sequences in that there is one choice fewer for the second element, two fewer for the third element, and so on.

Therefore instead of by an ordinary power of x, the value is given by a falling factorial power of x, in which each successive factor is one fewer than the previous one.

The formula is

Note that if n > x then one obtains a factor zero, so in this case there are no injective functions N → X at all; this is just a restatement of the pigeonhole principle.

#### Injective functions from N to X, up to a permutation of N

#### Functions from N to X, up to a permutation of N

This case is equivalent to counting multisets with n elements from X (also called n-multicombinations).

The reason is that for each element of X it is determined how many elements of N are mapped to it by f, while two functions that give the same such "multiplicities" to each element of X can always be transformed into another by a permutation of N. The formula counting all functions N → X is not useful here, because the number of them grouped together by permutations of N varies from one function to another.

Rather, as explained under combinations, the number of n-multicombinations from a set with x elements can be seen to be the same as the number of n-combinations from a set with x + n − 1 elements.

This reduces the problem to another one in the twelvefold way, and gives as result

#### Surjective functions from N to X, up to a permutation of N

This case is equivalent to counting multisets with n elements from X, for which each element of X occurs at least once.

This is also equivalent to counting the compositions of n with x (non-zero) terms, by listing the multiplicities of the elements of x in order.

The correspondence between functions and multisets is the same as in the previous case, and the surjectivity requirement means that all multiplicities are at least one.

By decreasing all multiplicities by 1, this reduces to the previous case; since the change decreases the value of n by x, the result is

Note that when n < x there are no surjective functions N → X at all (a kind of "empty pigeonhole" principle); this is taken into account in the formula, by the convention that binomial coefficients are always 0 if the lower index is negative.

The same value is also given by the expression

The form of the result suggests looking for a manner to associate a class of surjective functions N → X directly to a subset of n − x elements chosen from a total of n − 1, which can be done as follows.

First choose a total ordering of the sets N and X, and note that by applying a suitable permutation of N, every surjective function N → X can be transformed into a unique weakly increasing (and of course still surjective) function.

If one connects the elements of N in order by n − 1 arcs into a linear graph, then choosing any subset of n − x arcs and removing the rest, one obtains a graph with x connected components, and by sending these to the successive elements of X, one obtains a weakly increasing surjective function N → X; also the sizes of the connected components give a composition of n into x parts.

This argument is basically the one given at stars and bars, except that there the complementary choice of x − 1 "separations" is made.

#### Injective functions from N to X, up to a permutation of X

#### Injective functions from N to X, up to permutations of N and X

#### Surjective functions from N to X, up to a permutation of X

#### Surjective functions from N to X

#### Functions from N to X, up to a permutation of X

#### Surjective functions from N to X, up to permutations of N and X

#### Functions from N to X, up to permutations of N and X

### Extremal cases

The above formulas give the proper values for all finite sets N and X.

In some cases there are alternative formulas which are almost equivalent, but which do not give the correct result in some extremal cases, such as when N or X are empty.

The following considerations apply to such cases.

- For every set X there is exactly one function from the empty set to X (there are no values of this function to specify), which is always injective, but never surjective unless X is (also) empty.
- For every non-empty set N there are no functions from N to the empty set (there is at least one value of the function that must be specified, but it cannot).
- When n > x there are no injective functions N → X, and if n < x there are no surjective functions N → X.
- The expressions used in the formulas have as particular values

## Generalizations

### The twentyfold way

Another generalization called the twentyfold way was developed by Kenneth P. Bogart in his book "Combinatorics Through Guided Discovery".

In the problem of distributing objects to boxes both the objects and the boxes may be identical or distinct.

Bogart identifies 20 cases.

Credits to the contents of this page go to the authors of the corresponding Wikipedia page: en.wikipedia.org/wiki/Twelvefold way.