Pythagorean triples on a sphere?

It’s 8/15/17, which means it’s time to celebrate! The three numbers making up the date today form a Pythagorean triple, a triple of positive integers $(a,b,c)$ with $a^2+b^2=c^2$. Indeed, $8^2+15^2=64+225=289=17^2$.

Alternatively, by the Pythagorean theorem, a Pythagorean triple is any triple of positive integers which make up the sides of a right triangle:

It’s exciting when all three sides are integers, since many common right triangles’ lengths involve square roots: $(1,1,\sqrt{2})$, $(1,2,\sqrt{5})$, and $(1,\sqrt{3},2)$, to name a few. And these sides aren’t even rational, which the poor Pythagorean scholar Hippasus discovered by proving that $\sqrt{2}$ is irrational and was subsequently drowned to death by his colleagues, according to some historical accounts. So the ancient Pythagoreans in fact only believed in right triangles having rational side lengths.

Of course, given one Pythagorean triple, like $(3,4,5)$, we can construct infinitely many by scaling the sides: $(6,8,10)$, $(9,12,15)$, etc. (In fact, 12/9/15 was the previous Pythagorean triple day, and 12/16/20 will be the next.) So to classify all Pythagorean triples, it suffices to find the reduced triples, those with no common factors greater than $1$.

So the last reduced Pythagorean triple day was back in 2013 on 12/5/13, and the next one won’t be until 7/24/25!

Constructing Pythagorean triples

It’s well known that there are infinitely many reduced Pythagorean triples as well. One beautiful, famous proof of this derives a parameterization of all triples via geometric means:

Consider the unit circle and the point $P=(-1,0)$ on the circle. Let $Q=(0,r/s)$ be a rational point on the $y$-axis inside the circle. Then line $PQ$ intersects the circle at a second point $R$, and it turns out $R$ has rational coordinates as well. Some simple algebra with similar triangles (try it, if you haven’t seen it before!) gives us $$R=\left(\frac{r^2-s^2}{r^2+s^2},\frac{2rs}{r^2+s^2}\right).$$ But since $R$ lies on the unit circle, if $(x,y)$ are the coordinates of $R$ then $x^2+y^2=1$, and substituting and clearing denominators we have $$(r^2-s^2)^2+(2rs)^2=(r^2+s^2)^2$$ (which can also be checked with direct algebraic computation). It follows that $(r^2-s^2,2rs,r^2+s^2)$ is a Pythagorean triple for any integers $r$ and $s$, giving us infinitely many Pythagorean triples. And in fact, for $r$ and $s$ relatively prime of different parity, these triples are reduced.

Spherical considerations

Given that this is a global day to celebrate (assuming you use the standard world calendar), and the Earth is a sphere, I have to wonder whether Pythagorean triples actually exist if drawn on a perfect-sphere approximation of the Earth. First, we’d have to define what we even mean by that – are we measuring in meters? In feet? And what is a right triangle on the surface of a sphere anyway?

The most common definition of a triangle on a sphere is one formed by great circles. A great circle is any circle of maximal diameter around a sphere, in other words, whose plane contains the center of the sphere. So, the equator, the prime meridian, and all longitude lines are great circles on the Earth, but latitude lines are not. A line segment on a sphere is a segment of a great circle, and a (spherical) triangle is a shape formed by three points connected by three (spherical) line segments. The angle between two great circles that intersect is the angle between their planes. (Thanks to Wikipedia for the excellent image below.)

Since our world now has size rather than being a flat plane, let’s set the radius of the Earth to be $1$ unit for simplicity. So we’re working with triangles $ABC$ with a right angle at $C$, and asking when they have rational lengths:

Are there any Pythagorean triples on the unit sphere? Are there infinitely many?

These questions suddenly aren’t very easy. If we scale our sphere down by $\pi/2$ we can get at least one, by taking the equator, the prime meridian, and the $90^\circ$ longitudinal line. This forms a right triangle with all three lengths equal (and all angles right!) and so we can simply scale the Earth to make it any rational lengths we want. But this still doesn’t answer anything about the unit sphere.

There is some hope, however. In this paper by Hartshorne and Van Luijk, the authors show that there are infinitely many Pythagorean triples in the hyperbolic plane, using the Poincare Disk model and some nice hyperbolic trig formulas combined with some Eulerian number theory tricks. So Pythagorean triples are not the sole property of flat Euclidean space.

In addition, there is a “spherical Pythagorean theorem”, which in our notation, since the radius of our sphere is $1$, says that $$\cos(c)=\cos(a)\cos(b)$$ where $a,b,c$ are the side lengths of the triangle and $c$ is opposite the right angle. And yet, this offers little insight into whether this equation has rational solutions. Trig functions are generally much harder to deal with than polynomials, especially when it comes to solving equations over the rationals.

So, if you have any insights on this problem (or references to papers that have worked on it – I couldn’t find any in an initial search, but I am not very familiar with it), please let me know in the comments! And until then, happy reduced-Pythagorean-triple-in-flat-Euclidean-space day!

A faster divisibility rule for $7$ (and $13$)

I occasionally teach evening online classes for Art of Problem Solving (, a fantastic resource for high school students interested in learning mathematics), and one of the lessons I taught recently was on fast mental arithmetic tricks for testing divisibility by various small integers.

How do you tell if a number is divisible by $2$? Easy: look at the last digit. If that digit is even, the whole number is divisible by $2$, otherwise it’s not.

How do you tell if a number is divisible by $3$? Easy: Add up the digits and see if the sum is divisible by $3$. For instance, $1642$ is not divisible by $3$ because $1+6+4+2=13$ is not, but $1644$ is because $1+6+4+4=15$ is divisible by $3$.

The list goes on; there are nice, well-known divisibility rules for $4$, $5$, $6$, $8$, $9$, $10$, $11$, and $12$. But $7$ turns out to be rather annoying. Many of the existing texts on this subject use a recursive rule that goes something like this: Cross off the last digit, double that digit, and subtract it from the number that remains. Then keep doing that until you get a small number and see if the result is divisible by $7$.


Not only is this method somewhat cumbersome, the small number you get in the end does not tell you the remainder mod $7$ (see this post for an introduction to modular arithmetic), only whether or not that remainder is $0$. But it turns out that one can derive a different method along the same lines as the divisibility rule for $3$. (See this Wikipedia article for many, many divisibility rules!)

First, let’s recall the proof of the rule for $3$. Given a number in base ten, like $1642$, we can write it out as a sum of powers of $10$:

$$1642=1\cdot 10^3+6\cdot 10^2+4\cdot 10+2$$
Notice that $10$ has a remainder of $1$ when divided by $3$, so when looking at remainders mod $3$, we can replace all these $10$’s in the expression above with $1$’s:
$$1642\equiv 1+6+4+2 \pmod{3}$$ and we have recovered the sum of the digits expression.

Similarly, for mod $7$, we can write things out in terms of powers of $10$, and reduce these powers modulo $7$. The number $10$ itself is $3$ mod $7$, then $100\equiv 10^2\equiv 3^2$ is $2$ mod $7$. The next remainders, for $10^3,10^4, 10^5$ are $6,4,5$, and then the remainder when $10^6$ is divided by $7$ is $1$ again, wrapping around to the same value as $10^0$. Writing this out in a table, we see a pattern start to emerge:

n & 1 & 10 & 10^2 & 10^3 & 10^4 & 10^5 & 10^6 & 10^7 & 10^8 \\
n \bmod 7 & 1 & 3 & 2 & 6 & 4 & 5 & 1 & 3 & 2

Indeed, the remainders $1,3,2,6,4,5$ repeat every six steps, since at each step we are simply multiplying the previous remainder by $10$ and taking the result mod $7$. But we can make this pattern even simpler by replacing $6$ with $-1$, $4$ with $-3$, and $5$ with $-2$, which still have the same remainders mod $7$ (they differ by $7$ from the originals):

n & 1 & 10 & 10^2 & 10^3 & 10^4 & 10^5 & 10^6 & 10^7 & 10^8 \\
n \bmod 7 & 1 & 3 & 2 & -1 & -3 & -2 & 1 & 3 & 2

So, the pattern simply continues $1,3,2,-1,-3,-2,1,3,2,-1,-3,-2,\ldots$. We can therefore replace the powers of $10$ in a base ten expansion by these smaller digits to compute the number mod $7$. For instance, $$1642=10^3+6\cdot 10^2+4\cdot 10+2\equiv -1+6\cdot 2+4\cdot 3+2=25\equiv 4\pmod 7,$$ so $1642$ has a remainder of $4$ when divided by $7$.

As a larger example, starting with the number $133251140379$, we can make a simple table with the pattern $1,3,2,-1,-3,-2,\ldots$ underneath the digits from right to left (we reverse the direction since the powers of $10$ increase to the left):

1 & 3 & 3 & 2 & 5 & 1 & 1 & 4 & 0 & 3 & 7 & 9 \\
-2 & -3& -1& 2 & 3 & 1 & -2& -3& -1& 2 & 3 & 1

Then we can simply multiply the pairs of numbers in column together (and take the results mod 7) and add the results mod $7$. We can write each product below the pairs:

1 & 3 & 3 & 2 & 5 & 1 & 1 & 4 & 0 & 3 & 7 & 9 \\
-2 & -3& -1& 2 & 3 & 1 & -2& -3& -1& 2 & 3 & 1 \\\hline
-2 & -2& -3& 4 & 1 & 1 & -2& -5& 0 & 6 & 0 & 2

(Notice that in the computation of the products we sometimes automatically take the result mod $7$: for instance, $5\cdot 3=15\equiv 1\pmod 7$, so under the fifth column we simply write $1$.) Finally, the sum of the numbers in the bottom row is $-2-2-3+4+1+1-2-5+6+2=0$, so $133251140379$ is indeed divisible by $7$.

I tested this method by hand on a few large numbers and compared my calculation time to the iterative subtraction method, and it saves a good chunk of time. The particular number above took me 35 seconds to check with the $1,3,2$-pattern method, and 69 seconds to check with the double-and-subtract rule. Try it out, and let me know in the comments whether you find this method easier!

Addendum: A similar rule for $13$

While this method may become more cumbersome for larger moduli (which will have a longer pattern to memorize), it turns out that powers of $10$ mod $13$ also have an easily-memorizable repeating pattern:

n & 1 & 10 & 10^2 & 10^3 & 10^4 & 10^5 & 10^6 & 10^7 & 10^8 \\
n \bmod 13 & 1 & -3 & -4 & -1 & 3 & 4 & 1 & -3 & -4

The pattern is $1,-3,-4,-1,3,4$ repeating, which can be memorized by taking the length $3$ string $1,3,4$ and then writing negative signs on every other block of three entries starting at the tens digit. Let’s try it out on our example number above.

1 & 3 & 3 & 2 & 5 & 1 & 1 & 4 & 0 & 3 & 7 & 9 \\
4 & 3 &-1 &-4 &-3 & 1 & 4 & 3 & -1&-4 &-3 & 1 \\\hline
4 &-4 &-3 & 5 &-2 & 1 & 4 & -1& 0 & 1 & 5 & -4
We have $4-4-3+5-2+1+4-1+1+5-4=6$, so $133251140379$ has a remainder of $6$ when divided by $13$.

Ellipses, parabolas, and infinity

A parabola can be defined as the locus of points equidistant from a fixed point (called the focus) and a fixed line (called the directrix). But we know from projective geometry that parabolas and ellipses are essentially the same object. Does this mean we can also define an ellipse in terms of a point and a line?

While tutoring a high school student recently, we worked through a problem that was essentially showing just that in a special case. Remarkably:

An ellipse can be defined as the locus of points $P$ for which the distance $|PF|$ to a focus $F$ is $\alpha$ times the distance from $P$ to a fixed line $\ell$ for some positive real number $\alpha\lt 1$.

For instance, if we make $\alpha=1/2$ we would get an ellipse, whereas if $\alpha=1$ then it’s such a large ellipse that it’s a parabola!

Let’s check the math here. Recall that an ellipse is usually defined, synthetically, as the locus of points $P$ in the plane for which the sum of the distances $|PF_1|+|PF_2|$ for two fixed foci $F_1$ and $F_2$ is a fixed constant $s$. By translating this condition into coordinates, one can show that if we place an ellipse with its two lines of symmetry aligned with the $x$- and $y$-axes (with the wider part on the $x$-axis), centered at the origin, then the ellipse will have equation $$(x/b)^2+(y/a)^2=1$$ for some positive real numbers $a$ and $b$ with $a\le b$.

For an ellipse $E$ with this equation, the focii $F_1$ and $F_2$ have coordinates $(-c,0)$ and $(c,0)$ for some $c$. To find $c$ in terms of $a$ and $b$, we have that the sum of the distances of the point $(b,0)$ to the foci is $b-c+b-c=2b$, and the sum of the distances of the point $(0,a)$ to the foci is $2\sqrt{a^2+c^2}$ by the Pythagorean theorem. Therefore $2b=2\sqrt{a^2+c^2}$, and solving we find $c=\sqrt{b^2-a^2}$.

Now, I claim that if we set $\alpha=\frac{\sqrt{b^2-a^2}}{b}$ and let $\ell$ be the vertical line $x=b/\alpha$, then $E$ is the locus of points $P$ for which $|PF_2|$ is $\alpha$ times the distance from $P$ to $\ell$. Indeed, let $P=(x,y)$ be a point that has this property. Then the distance $|PF_2|$ is $\sqrt{(x-\sqrt{b^2-a^2})^2+y^2}$ and the distance from $P$ to $\ell$ is $\frac{b}{\alpha}-x$, so we have
\sqrt{(x-\sqrt{b^2-a^2})^2+y^2} &= \alpha\left(\frac{b}{\alpha}-x\right) \\
(x-\sqrt{b^2-a^2})^2+y^2 &= (b-\alpha x)^2 \\
x^2-2x\sqrt{b^2-a^2}+b^2-a^2+y^2 &= b^2-2x \sqrt{b^2-a^2} + (\alpha x)^2 \\
x^2-a^2+y^2 &= \frac{b^2-a^2}{b^2}x^2 \\
\frac{a^2}{b^2}x^2 + y^2 &= a^2 \\
\frac{x^2}{b^2} + \frac{y^2}{a^2} &= 1
which is indeed the equation of $E$.

A few noteworthy observations: first, it’s remarkable that the focus of the ellipse as defined in terms of the constant-sum-from-two-focii definition coincides with the focus that appears in the focus-and-directrix version. This makes one wonder if there is some natural relationship between the focii and this new directrix $\ell$, perhaps in terms of reciprocation (see this post for the definition of reciprocation in a circle.) And indeed, if we apply the transformation $(x,y)\mapsto (x/b,y/a)$, which maps our ellipse $E$ to the unit circle, the point $F_2$ maps to the point $(\alpha,0)$ and $\ell$ becomes the line $x=1/\alpha$, so indeed $F_2$ and $\ell$ form a reciprocal pair!

Second, consider the degenerate case of a circle, when $a=b$ and so $\alpha=0$. The condition $\alpha=0$ doesn’t really make sense unless we interpret the diagram in the projective plane and allow the directrix to be the line at infinity, which again makes the focus (the center of the circle) be the polar of this line.

Finally, consider the limit as $b$ approaches $\infty$, so that the ellipse is stretching out further and further until it becomes a parabola. (Exercise for the reader: What projective transformation maps an ellipse into a parabola?) In this limit we have $$\lim_{b\to \infty} \alpha = \lim_{b\to \infty} \frac{\sqrt{b^2-a^2}}{b}=\sqrt{\lim_{b\to \infty}1-(a^2/b^2)}=1,$$ and again we recover the case of a parabola. As a bonus, we find that the focus and directrix of a parabola must be reciprocal to each other across the parabola as well.

That’s all for today – a bit of fun with conics to kick off the new year. Happy and indivisible 2017!

What do Schubert curves, Young tableaux, and K-theory have in common? (Part I)

In a recent and fantastic collaboration between Jake Levinson and myself, we discovered new links between several different geometric and combinatorial constructions. We’ve weaved them together into a beautiful mathematical story, a story filled with drama and intrigue.

So let’s start in the middle.

Slider puzzles for mathematicians

Those of you who played with little puzzle toys growing up may remember the “15 puzzle”, a $4\times 4$ grid of squares with 15 physical squares and one square missing. A move consisted of sliding a square into the empty square. The French name for this game is “jeu de taquin”, which translates to “the teasing game”.

We can play a similar jeu de taquin game with semistandard Young tableaux. To set up the board, we need a slightly more general definition: a skew shape $\lambda/\mu$ is a diagram of squares formed by subtracting the Young diagram (see this post) of a partition $\mu$ from the (strictly larger) Young diagram of a partition $\lambda$. For instance, if $\lambda=(5,3,3,1)$ and $\mu=(2,1)$, then the skew shape $\lambda/\mu$ consists of the white squares shown below.


A semistandard Young tableau is then a way of filling the squares in such a skew shape with positive integers in such a way that the entries are weakly increasing across rows and strictly increasing down columns:


Now, an inner jeu de taquin slide consists of choosing an empty square adjacent to two of the numbers, and successively sliding entries inward into the empty square in such a way that the tableau remains semistandard at each step. This is an important rule, and it implies that, once we choose our inner corner, there is a unique choice between the squares east and south of the empty square at each step; only one can be slid to preserve the semistandard property. An example of an inwards jeu de taquin slide is shown (on repeat) in the animation below:


Here’s the game: perform a sequence of successive jeu de taquin slides until there are no empty inner corners left. What tableaux can you end up with? It turns out that, in fact, it doesn’t matter how you play this game! No matter which inner corner you pick to start the jeu de taquin slide at each step, you will end up with the same tableau in the end, called the rectification of the original tableau.


Since we always end up at the same result, it is sometimes more interesting to ask the question in reverse: can we categorize all skew tableaux that rectify to a given fixed tableau? This question has a nice answer in the case that we fix the rectification to be superstandard, that is, the tableau whose $i$th row is filled with all $i$’s:


It turns out that a semistandard tableau rectifies to a superstandard tableau if and only if it is Littlewood-Richardson, defined as follows. Read the rows from bottom to top, and left to right within a row, to form the reading word. Then the tableau is Littlewood-Richardson if every suffix (i.e. consecutive subword that reaches the end) of the reading word is ballot, which means that it has at least as many $i$’s as $i+1$’s for each $i\ge 1$.

For instance, the Littlewood-Richardson tableau below has reading word 352344123322111, and the suffix 123322111, for instance, has at least as many $1$’s as $2$’s, $2$’s as $3$’s, etc. Littlewood-Richardson tableaux are key to the Littlewood-Richardson rule, which allows us to efficiently compute products of Schur functions.


A convoluted commutator

The operation that Jake and I studied is a sort of commutator of rectification with another operation called “shuffling”. The process is as follows. Start with a Littlewood-Richardson tableau $T$, with one of the corners adjacent to $T$ on the inside marked with an “$\times$”. Call this extra square the “special box”. Then we define $\omega(T)$ to be the result of the following four operations applied to $T$.

  1. Rectification: Treat $\times$ as having value $0$ and rectify the entire skew tableau.
  2. Shuffling: Treat the $\times$ as the empty square to perform an inward jeu de taquin slide. The resulting empty square on the outer corner is the new location of $\times$.
  3. Un-rectification: Treat $\times$ as having value $\infty$ and un-rectify, using the sequence of moves from the rectification step in reverse.
  4. Shuffling back: Treat the $\times$ as the empty square to perform a reverse jeu de taquin slide, to move the $\times$ back to an inner corner.


We can iterate $\omega$ to get a permutation on all pairs $(\times,T)$ of a Littlewood-Richardson tableau $T$ with a special box marked on a chosen inner corner, with total shape $\lambda/\mu$ for some fixed $\lambda$ and $\mu$. As we’ll discuss in the next post, this permutation is related to the monodromy of a certain covering space of $\mathbb{RP}^1$ arising from the study of Schubert curves. But I digress.

One of the main results in our paper provides a new, more efficient algorithm for computing $\omega(T)$. In particular, the first three steps of the algorithm are what we call the “evacuation-shuffle”, and our local rule for evacuation shuffling is as follows:

  1. Phase 1. If the special box does not precede all of the $i$’s in reading order, switch the special box with the nearest $i$ prior to it in reading order. Then increment $i$ by $1$ and repeat this step.

    If, instead, the special box precedes all of the $i$’s in reading order, go to Phase 2.

  2. Phase 2. If the suffix of the reading word starting at the special box has more $i$’s than $i+1$’s, switch the special box with the nearest $i$ after it in reading order whose suffix has the same number of $i$’s as $i+1$’s. Either way, increment $i$ by $1$ and repeat this step until $i$ is larger than any entry of $T$.

So, to get $\omega(T)$, we first follow the Phase 1 and Phase 2 steps, and then we slide the special box back with a simple jeu de taquin slide. We can then iterate $\omega$, and compute an entire $\omega$-orbit, a cycle of its permutation. An example of this is shown below.


That’s all for now! In the next post I’ll discuss the beginning of the story: where this operator $\omega$ arises in geometry and why this algorithm is exactly what we need to understand it.

Deriving the discriminant of a cubic polynomial through analytic geometric means

This is a contributed gemstone, written by Sushant Vijayan. Enjoy!

Consider a generic cubic equation $$at^3+bt^2+ct+d=0,$$ where $a,b,c,d$ are real numbers and $a \neq 0$. Now we can transform this equation into a depressed cubic equation, i.e., one with no $t^2$ term, through means of Tschirnhaus Transformation $t=x-\frac{b}{3a}$, followed by dividing through by $a$. The depressed cubic equation is given by $$x^3+px+q=0$$ where $p$ and $q$ are related to $a,b,c,d$ by the relation given here. Setting $p=-m$ and $q=-n$ and rearranging we arrive at $$x^3 =mx+n \hspace{3cm} (1)$$ We will investigate the nature of roots for this equation. We begin with plotting out the graph of $y=x^3$:


It is an odd, monotonic, nondecreasing function with an inflection point at $x=0$. The real roots of the equation (1) are the $x$-coordinates of the points of intersection between the straight line $y= mx+n$ and the curve $y=x^3$. It is clear geometrically that however we draw the straight line, as a bare minimum, there would be at least one point of intersection. This corresponds to the fact that all cubic equations with real coefficients have at least one real root.

It is immediately clear that if the slope of the line $m$ is less than $0$, there is only one point of intersection and hence only one real root. On the other hand, the condition $m>0$ is equivalent to demanding that the depressed cubic $y(x)= x^3-mx-n$ has two points of local extrema (which is a necessary, but not sufficient, condition for the existence of three real roots).

Now the possibility of repeated real roots occur when the straight line is a tangent to the curve $y=x^3$. Hence consider the slope of the function $y=x^3$:
$$\frac{dy}{dx}=3x^2$$ Now equating the slopes (note $m\ge 0$) we get the tangent points:
$$x=\pm \sqrt{\frac {m}{3} } $$

Equivalently, the tangents are at the two values of $x$ for which
$$|x|=\sqrt{\frac {m}{3} }.$$

The corresponding $y$-intercepts for these tangent straight lines are:
$$n=\pm \frac{2m}{3}\sqrt{\frac{m}{3}}$$
or, in other words,


Thus for a straight line with a given slope $m\ge 0$ there are only two tangents with  corresponding tangent points and  $y$-intercepts. This would be the case of equation (1) having one real and one real repeated root.

What about the case where the slope is still $m$ but $|n|<\frac{2m}{3}\sqrt{\frac{m}{3}}$? bounded

In this case the straight line is parallel to the two tangent lines (since same slope) and in the region bounded by the two tangent lines. Hence it would necessarily intersect the curve $y=x^3$ at three points. This corresponds to the situation of equation (1) having three real roots.

And the case where $ |n| > \frac{2m}{3}\sqrt{\frac{m}{3}}$  corresponds to the area outside the  bounded region of  the two tangent lines and has only one point of intersection.


Hence the necessary and sufficient condition for three real roots (including repeated roots) is given by:
$$|n| \le \frac{2m}{3}\sqrt{\frac{m}{3}} \hspace{3cm}(2) $$

We note that the condition $m\ge 0$ is subsumed within the above condition (2), for If $m<0$ then the condition (2) cannot be satisfied. We note that condition (2) is not a radical expression. We square on both sides and rearrange to arrive at: $$ \frac{4m^3}{27}-n^2 \ge 0$$ Multiply by 27 and set $\bigtriangleup=4m^3-27n^2 $ to get $$ \bigtriangleup \ge 0 $$ The $\bigtriangleup $ is the discriminant of the cubic and has all the information required to determine the nature of the roots of the depressed cubic given by equation (1). This may then be written in terms of $a,b,c,d$ by inverting the Tschirnhaus transformation. A similar exercise may be carried out for the quartic equation, and a similar but albeit more complicated expression can be derived for the discriminant. It would be very interesting to do the same exercise for the quintic and see where it fails (fail it must, otherwise it would contradict Abel's famous result of insolvability of quintic equations using radical expressions).

Can you prove it?

As this ingenious post over at Math With Bad Drawings explained so clearly recently, there is a big difference between finding the answer to a math problem and being able to explain, beyond all reasonable doubt, why your answer is correct. It is the difference between solving it and proving it!

While mathematical proof is a huge part of math and science, it is unfortunately somewhat overlooked in the standard US curriculum. Partly for this reason, my family and I started a new math summer camp called Prove it! Math Academy. This year’s program will be a two-week crash course in proofs, using challenging problems and advanced mathematical concepts as examples.

Consider the following diagram which appears on our program website:


No explanation, no words, no proof. Just the picture. Enticing!

The first step in proving a fact is to state exactly what it is you are trying to prove. In this case, one might generalize/interpret this diagram as follows. Suppose you have a circle $O$ (the big white one) containing three circles $A$, $B$, and $C$ inside of it that are mutually externally tangent to each other and internally tangent to $O$. (These three circles $A$, $B$, and $C$ would be the large green, large yellow, and small red circle in the diagram above.) Now let $X$, $Y$, and $Z$ be the remaining green, yellow, and red circles respectively: for instance, $X$ is the circle tangent internally to $O$ and externally to $B$ and $C$. Then the lines $AX$, $BY$, and $CZ$ connecting the centers of these circles all meet at one point – they are concurrent.

You can construct this in Geogebra or Asymptote. You can vary the sizes of the circles and test that indeed, these lines always seem to be concurrent. But unlike in science, lots of evidence to support a fact is not good enough in mathematics. We want to prove beyond all reasonable doubt that this is always true, no matter what the sizes of the circles are. In other words, you can solve it, but can you prove it?

A very elegant solution to this problem, which appears on the next page of this post, is due to Michael Kural, one of my former students at MOP. He was intrigued by the image on our program flyer at a recent math tournament, and came up with a proof using only Desargues’ theorem, Monge’s theorem, homothety, and inversion. Read on!

Enumerating positive walks

In my last post, we saw that the number of positive up-down walks (going either up by $1$ or down by $1$ at each step) of length $2n$ starting from $(0,0)$ is $\binom{2n}{n}$. Furthermore, the number of these that return to height $0$ at the end is the $n$th Catalan number $\frac{1}{n+1}\binom{2n}{n}$. It is natural to ask, then: how many positive walks of length $2n$ end at height $h$ for a given $h$?


This question was brought up at the dinner table last week by Carlos Shine, who leads the Brazilian IMO team and was also one of my coworkers at the IdeaMath summer camp last week in San Jose. In classic mathematician style, we asked the waiter for a pen and began scribbling on the nearest napkin.


Table of values and a recursion

The first thing we noticed was that the ending height is always even, since we change the parity by $1$ at each of the $2n$ steps. So let the ending height be $2k$, and define $f(n,k)$ to be the number of positive up-down walks of length $2n$ ending at height $2k$. We can simply list out the possibilities for the first few values of $n$ to make the following table of values of $f(n,k)$.

k: & 0 & 1 & 2 & 3 & 4 \\\hline
n=0 & 1 & & & & \\
n=1 & 1 & 1 & & & \\
n=2 & 2 & 3 & 1 & & \\
n=3 & 5 & 9 & 5 & 1 & \\
n=4 & 14& 28& 20& 7 & 1 \\
$$ As expected, we have $f(n,0)=\frac{1}{n+1}\binom{2n}{n}$ and $\sum_{k}f(n,k)=\binom{2n}{n}$. The pattern $f(n,n)=1$ is explained by the fact that we must take $2n$ up-steps to reach height $2n$.

For the other values in the table, Carlos pointed out that they appear to satisfy the recurrence relation $$f(n,k)=f(n-1,k-1)+2f(n-1,k)+f(n-1,k+1)$$ for $k\ge 1$, and for $k=0$, $f(n,0)=f(n-1,0)+f(n-1,1)$. Indeed, we can prove this by considering the last two steps in a walk to height $2k$. If the last two steps are both up, then the number of such walks is the number of positive walks of length $2n-2$ to height $2k-2$, or $f(n-1,k-1)$. If the last two steps are up down or down up, then in each case we get a count of $f(n-1,k)$, and if the last two steps are both down, we get $f(n-1,k+1)$. The exception to this rule is when $k=0$ and so the last two steps could not have been down up or up up; in this case we have $f(n,0)=f(n-1,0)+f(n-1,1)$ as desired.


An explicit formula and a bijective proof

With a bit of guesswork and by unwrapping the recursion for $k=n-1,n-2,\ldots$, we can find that $$f(n,n-k)=\binom{2n}{k}-\binom{2n}{k-1}$$ satisfies the recursion. Indeed, it is a natural generalization of the closed formula for the Catalan numbers, which can also be written $$f(n,0)=\binom{2n}{n}-\binom{2n}{n-1}.$$

A few days later, Carlos noticed that we can prove this formula bijectively using the reflection principle. In an email, he explained:

“…I remembered the proof of the Catalan numbers using the reflection principle and thought, well, why don’t we try it for the generalized version? So here’s a more direct counting proof. I’m going to change the notation: let $g(n,k) = f(n,n-k)$. In $g(n,k)$, $k$ is the number of “down steps”, because, in order to get to the point $(2n,2(n-k))$, you need $k$ steps down and $2n-k$ steps up.

We are going to prove that
$$g(n,k) = \binom{2n}k – \binom{2n}{k-1}.$$

If we do not restrict the paths (that is, they are allowed to go below the $x$-axis), there are $\binom{2n}k$ choices ($k$ steps down, $2n-k$ steps up). We have to subtract the number of paths that go below the $x$-axis. Consider (quite like your previous post) the first time it hits the line $y=-1$, at point $P=(t,-1)$ and then reflect the path before $P$, that is, the points $(0,0), (1,a_1), \ldots, (t-1,0)$, about the line $y=-1$. For example:


bijects to


This bijects to a path from $(0,-2)$ to $(2n,2(n-k))$, which has the same number of steps up and down as a path from $(0,0)$ to $(2n,2(n-k+1))$, which has $k-1$ steps down and $2n-k+1$ steps up. In fact, any path from $(0,-2)$ to $(2n,2(n-k))$ has to hit the line $y=-1$, so the process is invertible. The number of such paths is $\binom{2n}{k-1}$. The result then follows.”


The distribution

Now that we have a formula, can we predict the asymptotic behaviour of the distribution of values as $n\to \infty$? The values appear to rise to a peak and then decrease again. Below is a plot of $g(100,k)$ for $k=0,\ldots,100$. Is it a normal distribution?


In fact, it turns out to approach the derivative of a normal distribution. The binomial distribution, that is, the discrete distribution having density function $B_{n}(x)=\frac{1}{2^n}\binom{n}{x}$, approaches a normal distribution as $n\to \infty$, having mean $n$ and standard deviation $\sqrt{n}/2$. Using the formula for a normal distribution and substituting $2n$ for $n$, we have that $\binom{2n}{x}$ approaches the function $$\frac{4^n}{\sqrt{\pi n}}e^{-(x-n)^2/n}.$$

Now, our formula for $g(n,k)$ is a “discrete derivative” of the sequence of binomial coefficients $\binom{2n}{x}$, so it will be approximated by the continuous derivative of the function defined above. This derivative comes out to:
$$4^n\cdot \frac{2(n-x)}{\sqrt{\pi n^3}}e^{-(x-n)^2/n}$$

Indeed, the plot for $n=200$ matches fairly closely!


A generating function

As a generatingfunctionologist, I can’t help but wrap up this post with a generating function for $g(n,k)$. Since $g(n,k)$ is the difference of consecutive coefficients of the generating function $$(1+x)^{2n}=\sum_{k}\binom{2n}{k}x^k,$$ we have that the first $n+1$ coefficients of $$(1-x)(1+x)^{2n}$$ are the values of $g(n,k)=\binom{2n}{k}-\binom{2n}{k-1}$ for $k=0,\ldots,n$.

Positive and recurrent walks

It is rather well-known that the number of up-down walks that start at $0$ and stay nonnegative until ending at $0$ after $2n$ steps is the $n$th Catalan number, $C_n=\frac{1}{n+1}\binom{2n}{n}$. Such paths, in which we either go up by $1$ or down by $1$ at each step and never fall below the $x$-axis, are called Dyck paths. An example for $n=5$ is shown below.


What is not so well-known is that:

The number of up-down walks of length $2n$ that stay nonnegative, but do not necessarily return to $0$ at the end, is simply $\binom{2n}{n}$.


This fact came up in my research a few days ago, and I figured that this had to be a trivial theorem that was easy to prove. Several hours of failed attempts later, I found myself googling “positive walks central binomial coefficients” and came across a wonderful recent paper of M. van Leeuwen.

He describes a bijective way of proving this fact by matching the walks with the recurrent (returning-to-zero) walks that are not necessarily positive. Clearly there are $\binom{2n}{n}$ such walks of the latter sort, since they can be thought of as walks from one corner to the opposite corner on a rotated $n\times n$ grid.


To construct the bijection, consider any recurrent walk from $0$ to $2n$. If the walk is already positive, simply map it to itself. If the walk becomes negative at any point, consider the first times the walk reaches $-1,-2,\ldots,-d$, where $-d$ is the minimum value of the walk. The step just before each of these positions must have been a down-step, and by simply changing each of these $d$ down-steps to an up-step, we end up with a positive walk of length $2n$.


To see that this is bijective, notice that if we had flipped $d$ of the down-steps of the recurrent walk, the last step of the new walk ends at height $2d$. So, to reverse the process, let $2d$ be the ending height of a given positive walk, and we wish to flip $d$ of the up-steps back to down-steps. But the ones we had flipped must be the steps following the last visits to heights $0,1,2,\ldots,d-1$ in the positive walk, by the above construction, so we simply find these visits and flip down the steps following each of them. Beautiful!

Teeter-totter transpositions

In attempting to write problems for the USAMO, I came across an interesting variant of Knuth equivalence whose structure is rather intricate.

Suppose we start with the numbers $1,2,3,\ldots,n$ written in a row. At each step, we may choose three adjacent numbers in the sequence, say $a,b,c$, and if either $a\lt b \lt c$ or $c\lt b\lt a$ then we can switch $a$ and $c$. So for instance, we might start with 12345 and change it to 32145. At this point we cannot use the subsequence 2,1,4 to change the 2 and 4, but we can use 1,4,5 and end up with 32541. The question is, how many possible sequences can we end up with after a sequence of such moves? (I call them `teeter-totter moves’, since you can either change three consecutive elements from increasing to decreasing or vice versa.)

Some quick hand calculations yield the values $1$, $1$, $2$, $3$, $6$, $10$, $20$, $35$, $70$, $126$ for $n=1$ up to $n=10$. But this sequence is quite recognizable: it is the sequence of “middle elements” in Pascal’s triangle:
So, it appears that the number of permutations of $1,2,\ldots,n$ that can be formed by a sequence of teeter-totter moves is $$\binom{n-1}{\lfloor\frac{n-1}{2}\rfloor}.$$ But why?

After struggling to answer this for quite a while, I stumbled across a solution in this paper by Linton, Propp, Roby, and West. It is their Theorem 14, and in my opinion one of the most impressive proofs of that paper. They essentially show that any achievable permutation can be decomposed into odd-length blocks of a certain form, and then enumerate these using generating function techniques.

My question is, is there a more direct approach? The quantity $\binom{n-1}{\lfloor\frac{n-1}{2}\rfloor}$ is too nice for there not to be a slick combinatorial interpretation that proves the result immediately. (It may be hard to prove that the said combinatorial interpretation is correct, but as long as it exists I would be happy.) Any ideas?

Knuth equivalence on a necklace

Lately, I’ve been working on some open problems related to a Young tableau operation called catabolism, which involves some interesting tableaux combinatorics. While working the other day, I encountered a simple and beautiful fact that I never would have expected.

Suppose we have a word $w$ whose letters are from the alphabet $\{1,\ldots,n\}$ (allowing repeated letters). A Knuth move is one of the following:

  1. Given three consecutive letters $xyz$ with $x>y$ and $x\le z$, replace $xyz$ with $xzy$, or vice versa (replace $xzy$ with $xyz$).
  2. Given three consecutive letters $xyz$ with $z\ge y$ and $z\lt x$, replace $xyz$ with $yxz$, or vice versa (replace $yxz$ with $xyz$).

In other words, if you have three consecutive letters and either the first or the third lies between the other two in size, the end letter acts as a pivot about which the other two letters switch places. Visually:


In the case of a tie, think of the repeated letters as getting larger as you go to the right. For instance, in the word $211$, the rightmost $1$ is “larger” than the middle $1$, so $211$ can be replaced with $121$ by switching the left $1$ and the $2$.

If two words can be reached from each other by a sequence of Knuth moves, we say they are Knuth equivalent. For instance, $12143$ is equivalent to $24113$ via the sequence of moves:


In addition, not all words on the same letters are equivalent: the word $123456$, for instance, has no valid Knuth moves and so it is only equivalent to itself.

Since the moves are invertible, Knuth equivalence sorts all words into Knuth equivalence classes. In 1970, Donald Knuth realized that these equivalence classes are in one-to-one correspondence with semistandard Young tableaux, which has made them very useful in algebraic combinatorics.

Recently, I needed to understand what happens when you allow the Knuth moves to “wrap around at the edge”. In other words, think of the word as being on a necklace or counterclockwise-oriented circle, and allow the same Knuth moves as before. For instance, $123456$ is now equivalent to $623451$ since the $5$ is between the next two letters, $1$ and $6$.

Now, we get “cyclic” Knuth equivlance classes. How do these relate to the original Knuth equivalence classes? It turns out that there is only one cyclic Knuth equivalence class for any given collection of letters. That is, for any two arrangements of the same collection of letters around a necklace, there is a sequence of cyclic Knuth moves that takes one to the other.

It’s a nice fact – can you see why it’s true?