The problems are from the Book ** Jaynes, Edwin T. Probability theory: The logic of science. Cambridge university press, 2003.** Chapter 3 of the book is on the topic of

**. These problems can be easily solved using the symbolic method of analytical combinatorics.**

*Elementary sampling theory*## De-arrangement Problem

For more information on de-arrangement problem, please refer to https://en.wikipedia.org/wiki/Derangement.

In high school, we can solve this via ** Inclusion–exclusion principle**. But now we have better tools: symbolic method of analytical combinatorics.

Permutations can be thought of a set of cycles. I first read this idea from Page 261 ** Knuth, Donald Ervin, and Oren Patashnik. Concrete mathematics: a foundation for computer science. Addison-Wesley, 2003**, and fully developed in

**for exponential generating functions.**

*Flajolet, Philippe, and Robert Sedgewick. Analytic combinatorics. cambridge University press, 2009*De-arrangement just adds constraints on the above symbolic construction: the inner cycles have to contain more than 1 element, and we have only M urns. So we have D(z) = SET(CYL_{>1}(z))=SET(CYL_2(z) + CLY_3(z)+\dots).It is quite straightforward to get the exponential generating function:

D(z) = e^{\frac{z^2}{2}+\frac{z^3}{3}+\dots} = exp(\ln(\frac{1}{1-z})-z)=\frac{e^{-z}}{1-z}The above generating function can be taken as e^{-z} times \frac{1}{1-z}, and the final sequence should be the convolution of the two parts’ corresponding sequence: [z^M]D(z) = [z^M]e^{-z} \circledast [z^M]\frac{1}{1-z} = \frac{(-1)^M}{M!} \circledast \{1,1,1,\dots\}. Convolution with const sequence of 1 is just partial summation of the original sequence. So we have

[z^M]D(z) = \sum_{k=0}^{M}\frac{(-1)^k}{k!}Since we are using exponential generating function, we get the total numbers of de-arrangement when there are M urns is M!\sum_{k=0}^{M}\frac{(-1)^k}{k!}, so the probability that we have at least one ball matching the urn (not de-arrangement) is

1 - M!\sum_{k=0}^{M}\frac{(-1)^k}{k!} / M! = \sum_{k=1}^{M}\frac{(-1)^{k+1}}{k!}Based on the above analysis, it is easy to show the asymptotic result: 1-\frac{1}{e}.

## Word model

The balls in urns model can be taken as a sequence with length M of set. So if there are no constraints, the symbolic method is W_M(z) = SEQ_M(SET(z)) = (e^z)^M=e^{Mz} . That is why we have N![z^N]W_M(z) = M^N ways.

Now consider one constraint: at least on ball in each urn. This will change the above symbolic method’s inner constructor to SET_{\ge 1}(z). Thus we will get a new exponential generating function:

B_M(z) = SEQ_M(SET_{\ge 1}(z)) = (e^z-1)^M.

This leads to the number of possibility to be:

N![z^N]B_M(z) = M! \genfrac\{\}{0pt}{0}{N}{M}And the probability is:

\frac{M! \genfrac\{\}{0pt}{0}{N}{M}}{M^N}If M > N, Stirling number of the second kind is 0, and this is consistent with our common sense.

### Python script to check

We can use the following python script to verify the result roughly:

```
computed = {}
import math
from random import randint
def sterling2(n, k):
key = str(n) + "," + str(k)
if key in computed.keys():
return computed[key]
if n == k == 0:
return 1
if (n > 0 and k == 0) or (n == 0 and k > 0):
return 0
if n == k:
return 1
if k > n:
return 0
result = k * sterling2(n - 1, k) + sterling2(n - 1, k - 1)
computed[key] = result
return result
def prob(M, N):
return math.factorial(M) * sterling2(N,M) / (M**N)
def mc_sim(M, N, r=1000000):
def test_once():
s = set(randint(1, M) for i in range(N))
return int(len(s) == M)
return float(sum(test_once() for i in range(r)))/r
if __name__ == "__main__":
M = 5
for N in range(5, 10):
p_acc = prob(M, N)
p_sim = mc_sim(M, N)
diff = abs(p_acc - p_sim) / (p_acc) * 100
print("For M=%s, N=%s, Formula computes: %.4lf, Simluation shows: %.4lf, Relative Diff is %.4lf%%" % (M, N, p_acc, p_sim, diff))
```

And a demo run output is shown below:

```
For M=5, N=5, Formula computes: 0.0384, Simluation shows: 0.0385, Relative Diff is 0.3880%
For M=5, N=6, Formula computes: 0.1152, Simluation shows: 0.1150, Relative Diff is 0.2014%
For M=5, N=7, Formula computes: 0.2150, Simluation shows: 0.2155, Relative Diff is 0.2316%
For M=5, N=8, Formula computes: 0.3226, Simluation shows: 0.3234, Relative Diff is 0.2536%
For M=5, N=9, Formula computes: 0.4271, Simluation shows: 0.4266, Relative Diff is 0.1052%
```

### Math proof

Then we can have

N! [z^N]B_M(z) = \sum_{m=0}^M{M \choose m}(-1)^{(M-m)} m^NThen we just need to prove

\sum_{m=0}^M (-1)^{(M-m)} \frac{m^N}{m!(M-m)!} = \genfrac\{\}{0pt}{0}{N}{M}The above equation is just equation 6.19 of Knuth’s concrete mathematical. It is easy to verify the equation satisfies the following recurrent relation:

\genfrac\{\}{0pt}{0}{N}{M} = M \genfrac\{\}{0pt}{0}{N-1}{M} + \genfrac\{\}{0pt}{0}{N-1}{M-1}Pay more attention to the boundary values, we know it is just the Stirling number of second kind.