The voting problem stated at the start of this post asks for the probability that a random sequence containing exactly fifty $1$’s and fifty $2$’s is ballot. So, let’s count how many ballot sequences of length $2n$ have the same number of $1$’s and $2$’s. Such words are also called **Dyck words**, and are in natural bijection with **Dyck paths** of height $n$, which are the lattice paths from $(0,0)$ to $(n,n)$ that stay weakly above the diagonal line $x=y$ in the plane.

To create a Dyck path from a Dyck word, read the word from right to left, and, starting from $(0,0)$, move up one step for each $1$ that you read, and right one step for each $2$ that you read. For instance, the word $22121121$ corresponds to the Dyck path:

Now, it is well-known that the number of Dyck paths of height $n$ is the $n$-th Catalan number $C_n$, and that an explicit formula for the Catalan numbers is given by $C_n=\frac{1}{n+1}\binom{2n}{n}$.

For the voting problem with 100 voters, it follows that there are $C_50=\frac{1}{51}\binom{100}{50}$ ballot sequences, out of a sample space of $\binom{100}{50}$ total possible sequences of votes. Therefore, the probability that one candidate is never ahead of the other in the count is $$\frac{1}{51}\binom{100}{50}/\binom{100}{50}=\frac{1}{51}.$$

## Counting ballot sequences recursively

We now give a more elementary way, without using crystals, to count the total number of two-letter ballot sequences of length $2n$ or $2n+1$ (with any number of $1$’s and $2$’s). Let $B_{m}$ be the number of two-letter ballot sequences of length $m$ for any $m$.

We first show that $$B_{2n}=2B_{2n-1}$$ for all $n\ge 1$. Indeed, any ballot sequence of length $2n-1$ has at least one more $1$ than $2$ since there are an odd number of letters total. Thus, we may place either a $1$ or $2$ at the start of the word to result in another ballot sequence of length $2n$. Conversely, given a ballot sequence of length $2n$, its final $2n-1$ letters form a ballot word. This gives a two-to-one correspondence between the ballot words of length $2n$ and those of length $2n-1$, so the claim holds.

Next, we show that $$B_{2n+1}=2B_{2n}-C_n$$ for all $n\ge 0$. Indeed, to form a ballot sequence of length $2n+1$, we may either add a $1$ or $2$ to the start of a ballot sequence of length $2n$ that has strictly more $1$’s than $2$’s as above, or we may add a $1$ (but not a $2$!) to a ballot sequence that has exactly the same number of $1$’s as $2$’s. Since the latter type are Dyck words, counted by the $n$-th Catalan number, we have $$B_{2n+1}=2\cdot (B_{2n}-C_n)+1\cdot C_n=2B_{2n}-C_n.$$

Finally, using the explicit formula $C_n=\frac{1}{n+1}\binom{2n}{n}$, we can check that our guesses $B_{2n}=\binom{2n}{n}$ and $B_{2n+1}=\binom{2n+1}{n}$ satisfy these recursions. Indeed, we have $$\begin{eqnarray*}

\binom{2n}{n}&=&\frac{(2n)!}{n!n!} \\

&=&\frac{2n}{n}\frac{(2n-1)!}{(n-1)!n!} \\

&=&2\binom{2n-1}{n-1}, \end{eqnarray*}$$ so the formulas satisfy the recursion $B_{2n}=2B_{2n-1}$. We also have $$\begin{eqnarray*}2\binom{2n}{n}-\frac{1}{n+1}\binom{2n}{n}&=&\frac{2(n+1)-1}{n+1}\binom{2n}{n} \\ &=&\frac{(2n+1)(2n)!}{(n+1)n!n!} \\ &=&\binom{2n+1}{n}\end{eqnarray*}$$ so the formulas satisfy $B_{2n+1}=2B_{2n}-C_n$. Since we have already checked the initial conditions, the result follows.

## Corollary: A proof of the Catalan explicit formula

Notice that our crystal proof on the first page did not at all depend on the explicit formula for the Catalan numbers, but our second proof did. We can therefore combine these two methods to give a crystal-theoretic proof of the beautiful formula $C_n=\frac{1}{n+1}\binom{2n}{n}$.

Very nice post! Is there a proof of the “hook length formula” using cystals?