In my last post, we saw that the number of positive up-down walks (going either up by $1$ or down by $1$ at each step) of length $2n$ starting from $(0,0)$ is $\binom{2n}{n}$. Furthermore, the number of these that return to height $0$ at the end is the $n$th Catalan number $\frac{1}{n+1}\binom{2n}{n}$. It is natural to ask, then: how many positive walks of length $2n$ end at height $h$ for a given $h$?

This question was brought up at the dinner table last week by Carlos Shine, who leads the Brazilian IMO team and was also one of my coworkers at the IdeaMath summer camp last week in San Jose. In classic mathematician style, we asked the waiter for a pen and began scribbling on the nearest napkin.

## Table of values and a recursion

The first thing we noticed was that the ending height is always even, since we change the parity by $1$ at each of the $2n$ steps. So let the ending height be $2k$, and define $f(n,k)$ to be the number of positive up-down walks of length $2n$ ending at height $2k$. We can simply list out the possibilities for the first few values of $n$ to make the following table of values of $f(n,k)$.

$$\begin{array}{c|ccccc}

k: & 0 & 1 & 2 & 3 & 4 \\\hline

n=0 & 1 & & & & \\

n=1 & 1 & 1 & & & \\

n=2 & 2 & 3 & 1 & & \\

n=3 & 5 & 9 & 5 & 1 & \\

n=4 & 14& 28& 20& 7 & 1 \\

\end{array}

$$ As expected, we have $f(n,0)=\frac{1}{n+1}\binom{2n}{n}$ and $\sum_{k}f(n,k)=\binom{2n}{n}$. The pattern $f(n,n)=1$ is explained by the fact that we must take $2n$ up-steps to reach height $2n$.

For the other values in the table, Carlos pointed out that they appear to satisfy the recurrence relation $$f(n,k)=f(n-1,k-1)+2f(n-1,k)+f(n-1,k+1)$$ for $k\ge 1$, and for $k=0$, $f(n,0)=f(n-1,0)+f(n-1,1)$. Indeed, we can prove this by considering the last two steps in a walk to height $2k$. If the last two steps are both up, then the number of such walks is the number of positive walks of length $2n-2$ to height $2k-2$, or $f(n-1,k-1)$. If the last two steps are up down or down up, then in each case we get a count of $f(n-1,k)$, and if the last two steps are both down, we get $f(n-1,k+1)$. The exception to this rule is when $k=0$ and so the last two steps could not have been down up or up up; in this case we have $f(n,0)=f(n-1,0)+f(n-1,1)$ as desired.

## An explicit formula and a bijective proof

With a bit of guesswork and by unwrapping the recursion for $k=n-1,n-2,\ldots$, we can find that $$f(n,n-k)=\binom{2n}{k}-\binom{2n}{k-1}$$ satisfies the recursion. Indeed, it is a natural generalization of the closed formula for the Catalan numbers, which can also be written $$f(n,0)=\binom{2n}{n}-\binom{2n}{n-1}.$$

A few days later, Carlos noticed that we can prove this formula bijectively using the *reflection principle*. In an email, he explained:

“…I remembered the proof of the Catalan numbers using the reflection principle and thought, well, why don’t we try it for the generalized version? So here’s a more direct counting proof. I’m going to change the notation: let $g(n,k) = f(n,n-k)$. In $g(n,k)$, $k$ is the number of “down steps”, because, in order to get to the point $(2n,2(n-k))$, you need $k$ steps down and $2n-k$ steps up.

We are going to prove that

$$g(n,k) = \binom{2n}k – \binom{2n}{k-1}.$$If we do not restrict the paths (that is, they are allowed to go below the $x$-axis), there are $\binom{2n}k$ choices ($k$ steps down, $2n-k$ steps up). We have to subtract the number of paths that go below the $x$-axis. Consider (quite like your previous post) the first time it hits the line $y=-1$, at point $P=(t,-1)$ and then reflect the path before $P$, that is, the points $(0,0), (1,a_1), \ldots, (t-1,0)$, about the line $y=-1$. For example:

bijects to

This bijects to a path from $(0,-2)$ to $(2n,2(n-k))$, which has the same number of steps up and down as a path from $(0,0)$ to $(2n,2(n-k+1))$, which has $k-1$ steps down and $2n-k+1$ steps up. In fact, any path from $(0,-2)$ to $(2n,2(n-k))$ has to hit the line $y=-1$, so the process is invertible. The number of such paths is $\binom{2n}{k-1}$. The result then follows.”

Beautiful!

## The distribution

Now that we have a formula, can we predict the asymptotic behaviour of the distribution of values as $n\to \infty$? The values appear to rise to a peak and then decrease again. Below is a plot of $g(100,k)$ for $k=0,\ldots,100$. Is it a normal distribution?

In fact, it turns out to approach the *derivative* of a normal distribution. The binomial distribution, that is, the discrete distribution having density function $B_{n}(x)=\frac{1}{2^n}\binom{n}{x}$, approaches a normal distribution as $n\to \infty$, having mean $n$ and standard deviation $\sqrt{n}/2$. Using the formula for a normal distribution and substituting $2n$ for $n$, we have that $\binom{2n}{x}$ approaches the function $$\frac{4^n}{\sqrt{\pi n}}e^{-(x-n)^2/n}.$$

Now, our formula for $g(n,k)$ is a “discrete derivative” of the sequence of binomial coefficients $\binom{2n}{x}$, so it will be approximated by the continuous derivative of the function defined above. This derivative comes out to:

$$4^n\cdot \frac{2(n-x)}{\sqrt{\pi n^3}}e^{-(x-n)^2/n}$$

Indeed, the plot for $n=200$ matches fairly closely!

## A generating function

As a generatingfunctionologist, I can’t help but wrap up this post with a generating function for $g(n,k)$. Since $g(n,k)$ is the difference of consecutive coefficients of the generating function $$(1+x)^{2n}=\sum_{k}\binom{2n}{k}x^k,$$ we have that the first $n+1$ coefficients of $$(1-x)(1+x)^{2n}$$ are the values of $g(n,k)=\binom{2n}{k}-\binom{2n}{k-1}$ for $k=0,\ldots,n$.

Hi Maria,

That is a nice bijection indeed. However, there is no need to require that the number of steps be even. Let $h(n,k) = binom{n}{k} – binom{n}{k-1}$ for $n, k ge 0$. Then $h(n,k)$ is the number of Dyck paths ending at height $n-2k$ for $n ge 2k$, and here the recurrence is $h(n,k) = h(n-1,k) + h(n-1,k-1)$. This is the same recurrence as for binomial coefficients, the only difference being the initial condition $h(0,1) = -1$. The generating function for $h(n,k) x^n y^k$ is $frac{1-y}{1-x(1+y)}$.

Also, if I am not mistaken the standard deviation of the normal distribution is $sqrt{n}/2$.

Steven

That’s true – thanks for pointing that out!

I’ve also edited the typo.

Maria