Pythagorean triples on a sphere?

It’s 8/15/17, which means it’s time to celebrate! The three numbers making up the date today form a Pythagorean triple, a triple of positive integers $(a,b,c)$ with $a^2+b^2=c^2$. Indeed, $8^2+15^2=64+225=289=17^2$.

Alternatively, by the Pythagorean theorem, a Pythagorean triple is any triple of positive integers which make up the sides of a right triangle:

It’s exciting when all three sides are integers, since many common right triangles have side lengths involving square roots: $(1,1,\sqrt{2})$, $(1,2,\sqrt{5})$, and $(1,\sqrt{3},2)$, to name a few. These lengths aren’t even rational, a fact that, according to some historical accounts, cost the poor Pythagorean scholar Hippasus his life: he proved that $\sqrt{2}$ is irrational and was subsequently drowned by his colleagues, for the ancient Pythagoreans believed that all lengths should be rational.

Of course, given one Pythagorean triple, like $(3,4,5)$, we can construct infinitely many by scaling the sides: $(6,8,10)$, $(9,12,15)$, etc. (In fact, 12/9/15 was the previous Pythagorean triple day, and 12/16/20 will be the next.) So to classify all Pythagorean triples, it suffices to find the reduced triples, those with no common factors greater than $1$.

So the last reduced Pythagorean triple day was back in 2013 on 12/5/13, and the next one won’t be until 7/24/25!

Constructing Pythagorean triples

It’s well known that there are infinitely many reduced Pythagorean triples as well. One beautiful, famous proof of this derives a parameterization of all reduced triples by geometric means:

Consider the unit circle and the point $P=(-1,0)$ on the circle. Let $Q=(0,r/s)$ be a rational point on the $y$-axis inside the circle, so $0\lt r\lt s$. Then line $PQ$ intersects the circle at a second point $R$, and it turns out $R$ has rational coordinates as well. Some simple algebra with similar triangles (try it, if you haven’t seen it before!) gives us $$R=\left(\frac{s^2-r^2}{s^2+r^2},\frac{2rs}{s^2+r^2}\right).$$ But since $R$ lies on the unit circle, if $(x,y)$ are the coordinates of $R$ then $x^2+y^2=1$, and substituting and clearing denominators we have $$(s^2-r^2)^2+(2rs)^2=(s^2+r^2)^2$$ (which can also be checked with direct algebraic computation). It follows that $(s^2-r^2,2rs,s^2+r^2)$ is a Pythagorean triple for any positive integers $r\lt s$, giving us infinitely many Pythagorean triples. And in fact, for $r$ and $s$ relatively prime of different parity, these triples are reduced.
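This parameterization is also easy to play with on a computer. Here is a quick Python sketch (my own, not part of the argument above) that generates reduced triples from coprime pairs $r\lt s$ of opposite parity:

```python
# Generate reduced Pythagorean triples (s^2 - r^2, 2rs, s^2 + r^2)
# from coprime pairs r < s of opposite parity.
from math import gcd

def reduced_triples(bound):
    for s in range(2, bound + 1):
        for r in range(1, s):
            if gcd(r, s) == 1 and (r + s) % 2 == 1:
                yield (s * s - r * r, 2 * r * s, s * s + r * r)

for triple in reduced_triples(5):
    print(triple)   # includes (3, 4, 5), (5, 12, 13), (15, 8, 17), ...
```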

Spherical considerations

Given that this is a global day to celebrate (assuming you use the standard world calendar), and the Earth is a sphere, I have to wonder whether Pythagorean triples actually exist if drawn on a perfect-sphere approximation of the Earth. First, we’d have to define what we even mean by that – are we measuring in meters? In feet? And what is a right triangle on the surface of a sphere anyway?

The most common definition of a triangle on a sphere is one formed by great circles. A great circle is any circle of maximal diameter around a sphere, in other words, whose plane contains the center of the sphere. So, the equator, the prime meridian, and all longitude lines are great circles on the Earth, but latitude lines are not. A line segment on a sphere is a segment of a great circle, and a (spherical) triangle is a shape formed by three points connected by three (spherical) line segments. The angle between two great circles that intersect is the angle between their planes.

Since our triangles now live on a sphere with a scale, rather than in the flat plane, let’s set the radius of the Earth to be $1$ unit for simplicity. So we’re working with spherical triangles $ABC$ with a right angle at $C$, and asking when they have rational side lengths:

Are there any Pythagorean triples on the unit sphere? Are there infinitely many?

These questions suddenly aren’t very easy. If we scale our sphere down by a factor of $\pi/2$, we can get at least one, by taking the triangle formed by the equator, the prime meridian, and the $90^\circ$ line of longitude. This is a right triangle with all three sides of equal length (and all angles right!), so we can simply scale the Earth to make the sides any rational length we want. But this still doesn’t answer anything about the unit sphere.

There is some hope, however. In this paper by Hartshorne and Van Luijk, the authors show that there are infinitely many Pythagorean triples in the hyperbolic plane, using the Poincaré disk model and some nice hyperbolic trig formulas combined with some Eulerian number theory tricks. So Pythagorean triples are not the sole property of flat Euclidean space.

In addition, there is a “spherical Pythagorean theorem”, which in our notation, since the radius of our sphere is $1$, says that $$\cos(c)=\cos(a)\cos(b)$$ where $a,b,c$ are the side lengths of the triangle and $c$ is opposite the right angle. And yet, this offers little insight into whether this equation has rational solutions. Trig functions are generally much harder to deal with than polynomials, especially when it comes to solving equations over the rationals.
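While I have no idea how to solve this, it’s at least easy to experiment numerically. Here’s a throwaway Python script (mine, with arbitrary search bounds; finding nothing proves nothing) that looks for isosceles right triangles with rational legs $a=b=p/q$ whose hypotenuse $c=\arccos(\cos^2 a)$ happens to be close to a rational with small denominator:

```python
# Exploratory search (not a proof!): rational legs a = b = p/q; is
# c = arccos(cos(a)^2) suspiciously close to a small-denominator rational?
# The bounds DEN and TOL are arbitrary choices.
from math import acos, cos, pi
from fractions import Fraction

DEN, TOL = 30, 1e-9

for q in range(1, DEN + 1):
    for p in range(1, int(pi * q) + 1):   # keep the legs shorter than pi
        c = acos(cos(p / q) ** 2)
        guess = Fraction(c).limit_denominator(DEN)
        if abs(float(guess) - c) < TOL:
            print(Fraction(p, q), guess)  # candidate rational triple (a, a, c)
```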

So, if you have any insights on this problem (or references to papers that have worked on it – I couldn’t find any in an initial search, but I am not very familiar with it), please let me know in the comments! And until then, happy reduced-Pythagorean-triple-in-flat-Euclidean-space day!

A faster divisibility rule for $7$ (and $13$)

I occasionally teach evening online classes for Art of Problem Solving (artofproblemsolving.com, a fantastic resource for high school students interested in learning mathematics), and one of the lessons I taught recently was on fast mental arithmetic tricks for testing divisibility by various small integers.

How do you tell if a number is divisible by $2$? Easy: look at the last digit. If that digit is even, the whole number is divisible by $2$, otherwise it’s not.

How do you tell if a number is divisible by $3$? Easy: Add up the digits and see if the sum is divisible by $3$. For instance, $1642$ is not divisible by $3$ because $1+6+4+2=13$ is not, but $1644$ is because $1+6+4+4=15$ is divisible by $3$.

The list goes on; there are nice, well-known divisibility rules for $4$, $5$, $6$, $8$, $9$, $10$, $11$, and $12$. But $7$ turns out to be rather annoying. Many of the existing texts on this subject use a recursive rule that goes something like this: Cross off the last digit, double that digit, and subtract it from the number that remains. Then keep doing that until you get a small number and see if the result is divisible by $7$.

Ouch.

Not only is this method somewhat cumbersome, the small number you get in the end does not tell you the remainder mod $7$ (see this post for an introduction to modular arithmetic), only whether or not that remainder is $0$. But it turns out that one can derive a different method along the same lines as the divisibility rule for $3$. (See this Wikipedia article for many, many divisibility rules!)

First, let’s recall the proof of the rule for $3$. Given a number in base ten, like $1642$, we can write it out as a sum of powers of $10$:

$$1642=1\cdot 10^3+6\cdot 10^2+4\cdot 10+2$$
Notice that $10$ has a remainder of $1$ when divided by $3$, so when looking at remainders mod $3$, we can replace all these $10$’s in the expression above with $1$’s:
$$1642\equiv 1+6+4+2 \pmod{3}$$ and we have recovered the sum of the digits expression.

Similarly, for mod $7$, we can write things out in terms of powers of $10$, and reduce these powers modulo $7$. The number $10$ itself is $3$ mod $7$, then $100\equiv 10^2\equiv 3^2$ is $2$ mod $7$. The next remainders, for $10^3,10^4, 10^5$ are $6,4,5$, and then the remainder when $10^6$ is divided by $7$ is $1$ again, wrapping around to the same value as $10^0$. Writing this out in a table, we see a pattern start to emerge:

$$\begin{array}{cccccccccc}
n & 1 & 10 & 10^2 & 10^3 & 10^4 & 10^5 & 10^6 & 10^7 & 10^8 \\
n \bmod 7 & 1 & 3 & 2 & 6 & 4 & 5 & 1 & 3 & 2
\end{array}$$

Indeed, the remainders $1,3,2,6,4,5$ repeat every six steps, since at each step we are simply multiplying the previous remainder by $10$ and taking the result mod $7$. But we can make this pattern even simpler by replacing $6$ with $-1$, $4$ with $-3$, and $5$ with $-2$, which still have the same remainders mod $7$ (they differ by $7$ from the originals):

$$\begin{array}{cccccccccc}
n & 1 & 10 & 10^2 & 10^3 & 10^4 & 10^5 & 10^6 & 10^7 & 10^8 \\
n \bmod 7 & 1 & 3 & 2 & -1 & -3 & -2 & 1 & 3 & 2
\end{array}$$

So, the pattern simply continues $1,3,2,-1,-3,-2,1,3,2,-1,-3,-2,\ldots$. We can therefore replace the powers of $10$ in a base ten expansion by these smaller digits to compute the number mod $7$. For instance, $$1642=10^3+6\cdot 10^2+4\cdot 10+2\equiv -1+6\cdot 2+4\cdot 3+2=25\equiv 4\pmod 7,$$ so $1642$ has a remainder of $4$ when divided by $7$.
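Here’s a short Python sketch (my own; the helper names are made up) that carries out this digit-pattern method for any modulus, by reducing the powers of $10$ to “balanced” residues the same way we replaced $6,4,5$ with $-1,-3,-2$:

```python
# Weighted digit sum using balanced residues of powers of 10 mod m.
def balanced_pattern(m, length):
    """Residues of 10^0, 10^1, ... mod m, shifted into the range [-m/2, m/2]."""
    pattern, power = [], 1
    for _ in range(length):
        r = power % m
        pattern.append(r if r <= m // 2 else r - m)
        power = (power * 10) % m
    return pattern

def remainder(n, m):
    digits = [int(d) for d in str(n)][::-1]   # least significant digit first
    return sum(d * p for d, p in zip(digits, balanced_pattern(m, len(digits)))) % m

print(balanced_pattern(7, 6))   # [1, 3, 2, -1, -3, -2]
print(remainder(1642, 7))       # 4
```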

As a larger example, starting with the number $133251140379$, we can make a simple table with the pattern $1,3,2,-1,-3,-2,\ldots$ underneath the digits from right to left (we reverse the direction since the powers of $10$ increase to the left):

$$\begin{array}{cccccccccccc}
1 & 3 & 3 & 2 & 5 & 1 & 1 & 4 & 0 & 3 & 7 & 9 \\
-2 & -3& -1& 2 & 3 & 1 & -2& -3& -1& 2 & 3 & 1
\end{array}$$

Then we can simply multiply the pairs of numbers in column together (and take the results mod 7) and add the results mod $7$. We can write each product below the pairs:

$$\begin{array}{cccccccccccc}
1 & 3 & 3 & 2 & 5 & 1 & 1 & 4 & 0 & 3 & 7 & 9 \\
-2 & -3& -1& 2 & 3 & 1 & -2& -3& -1& 2 & 3 & 1 \\\hline
-2 & -2& -3& 4 & 1 & 1 & -2& -5& 0 & 6 & 0 & 2
\end{array}$$

(Notice that in the computation of the products we sometimes automatically take the result mod $7$: for instance, $5\cdot 3=15\equiv 1\pmod 7$, so under the fifth column we simply write $1$.) Finally, the sum of the numbers in the bottom row is $-2-2-3+4+1+1-2-5+6+2=0$, so $133251140379$ is indeed divisible by $7$.

I tested this method by hand on a few large numbers and compared my calculation time to the iterative subtraction method, and it saves a good chunk of time. The particular number above took me 35 seconds to check with the $1,3,2$-pattern method, and 69 seconds to check with the double-and-subtract rule. Try it out, and let me know in the comments whether you find this method easier!

Addendum: A similar rule for $13$

While this method may become more cumbersome for larger moduli (which will have a longer pattern to memorize), it turns out that powers of $10$ mod $13$ also have an easily-memorizable repeating pattern:

$$\begin{array}{cccccccccc}
n & 1 & 10 & 10^2 & 10^3 & 10^4 & 10^5 & 10^6 & 10^7 & 10^8 \\
n \bmod 13 & 1 & -3 & -4 & -1 & 3 & 4 & 1 & -3 & -4
\end{array}$$

The pattern is $1,-3,-4,-1,3,4$ repeating, which can be memorized by taking the length $3$ string $1,3,4$ and then writing negative signs on every other block of three entries starting at the tens digit. Let’s try it out on our example number above.

$$\begin{array}{cccccccccccc}
1 & 3 & 3 & 2 & 5 & 1 & 1 & 4 & 0 & 3 & 7 & 9 \\
4 & 3 &-1 &-4 &-3 & 1 & 4 & 3 & -1&-4 &-3 & 1 \\\hline
4 &-4 &-3 & 5 &-2 & 1 & 4 & -1& 0 & 1 & 5 & -4
\end{array}$$
We have $4-4-3+5-2+1+4-1+1+5-4=6$, so $133251140379$ has a remainder of $6$ when divided by $13$.
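The `balanced_pattern` and `remainder` sketch from earlier in this post (my own code) works for $13$ without modification:

```python
print(balanced_pattern(13, 6))       # [1, -3, -4, -1, 3, 4]
print(remainder(133251140379, 13))   # 6
```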

Shifted partitions and the Orthogonal Grassmannian

In a previous post, we discussed Schubert calculus in the Grassmannian and the various intersection problems it can solve. I have recently been thinking about problems involving the type B variant of Schubert calculus, namely, intersections in the orthogonal Grassmannian. So, it’s time for a combinatorial introduction to the orthogonal Grassmannian!

What is the orthogonal Grassmannian?

In order to generalize Grassmannians to other Lie types, we first need to understand in what sense the ordinary Grassmannian is type A. Recall from this post that the complete flag variety can be written as a quotient of $G=\mathrm{GL}_n$ by a Borel subgroup $B$, such as the group of upper-triangular matrices. It turns out that all partial flag varieties, the varieties of partial flags of certain degrees, can be similarly defined as a quotient $$G/P$$ for a parabolic subgroup $P$, namely a closed intermediate subgroup $B\subset P\subset G$.

The (ordinary) Grassmannian $\mathrm{Gr}(n,k)$, then, can be thought of as the quotient of $\mathrm{GL}_n$ by the parabolic subgroup $S=\mathrm{Stab}(V)$ where $V$ is any fixed $k$-dimensional subspace of $\mathbb{C}^n$. Similarly, we can start with a different reductive group, say the special orthogonal group $\mathrm{SO}_{2n+1}$, and quotient by parabolic subgroups to get partial flag varieties in other Lie types.

In particular, the orthogonal Grassmannian $\mathrm{OG}(2n+1,k)$ is the quotient $\mathrm{SO}_{2n+1}/P$ where $P$ is the stabilizer of a fixed isotropic $k$-dimensional subspace $V$. The term isotropic means that $V$ satisfies $\langle v,w\rangle=0$ for all $v,w\in V$ with respect to a chosen symmetric bilinear form $\langle,\rangle$.

The isotropic condition, at first glance, seems very unnatural. After all, how could a nonzero subspace possibly be so orthogonal to itself? Well, it is first important to note that we are working over $\mathbb{C}$, not $\mathbb{R}$, and the bilinear form is symmetric, not conjugate-symmetric. So for instance, if we choose a basis of $\mathbb{C}^{2n+1}$ and define the bilinear form to be the usual dot product $$\langle (a_1,\ldots,a_{2n+1}),(b_1,\ldots,b_{2n+1})\rangle=a_1b_1+a_2b_2+\cdots+a_{2n+1}b_{2n+1},$$ then the vector $(3,5i,4)$ is orthogonal to itself: $3\cdot 3+5i\cdot 5i+4\cdot 4=0$.

While the choice of symmetric bilinear form does not change the fundamental geometry of the orthogonal Grassmannian, one choice in particular makes things easier to work with in practice: the “reverse dot product” given by
$$\langle (a_1,\ldots,a_{2n+1}),(b_1,\ldots,b_{2n+1})\rangle=\sum_{i=1}^{2n+1} a_ib_{2n+2-i}.$$ In particular, with respect to this symmetric form, the “standard” complete flag $\mathcal{F}$, in which $\mathcal{F}_i$ is the span of the first $i$ rows of the identity matrix $I_{2n+1}$, is an orthogonal flag, with $\mathcal{F}_i^\perp=\mathcal{F}_{2n+1-i}$ for all $i$. Orthogonal flags are precisely the type of flags that are used to define Schubert varieties in the orthogonal Grassmannian.
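Here’s a quick numerical sanity check of this fact (a numpy sketch of my own, for $n=2$): the Gram matrix of the reverse dot product is the anti-diagonal identity, and the standard flag satisfies $\mathcal{F}_i^\perp=\mathcal{F}_{2n+1-i}$:

```python
# Check F_i^perp = F_{2n+1-i} for the standard flag under the reverse
# dot product <u, v> = sum_i u_i v_{2n+2-i} (here n = 2, so N = 5).
import numpy as np

n = 2
N = 2 * n + 1
J = np.fliplr(np.eye(N, dtype=int))    # Gram matrix of the reverse dot product

for i in range(N + 1):
    F_i = np.eye(N, dtype=int)[:i]         # span of first i rows of identity
    F_opp = np.eye(N, dtype=int)[:N - i]   # span of first 2n+1-i rows
    # every vector in F_i pairs to zero with every vector in F_{2n+1-i}
    assert np.all(F_i @ J @ F_opp.T == 0)
print("F_i and F_{2n+1-i} pair to zero for all i")
```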

Other useful variants of the reverse dot product involve certain factorial coefficients, but for this post this simpler version will do.

Going back to the main point, note that isotropic subspaces are sent to other isotropic subspaces under the action of the orthogonal group: if $\langle v,w\rangle=0$ then $\langle Av,Aw\rangle=\langle v,w\rangle=0$ for any $A\in \mathrm{SO}_{2n+1}$. Thus the orthogonal Grassmannian $\mathrm{OG}(2n+1,k)$, which is the quotient $\mathrm{SO}_{2n+1}/\mathrm{Stab}(V)$, can be interpreted as the variety of all $k$-dimensional isotropic subspaces of $\mathbb{C}^{2n+1}$.

Schubert varieties and row reduction in $\mathrm{OG}(2n+1,n)$

Just as in the ordinary Grassmannian, there is a Schubert cell decomposition for the orthogonal Grassmannian, and the combinatorics of Schubert varieties is particularly nice in the case of $\mathrm{OG}(2n+1,n)$ in which the orthogonal subspaces are “half dimension” $n$. (In particular, this corresponds to the “cominuscule” type in which the simple root associated to our maximal parabolic subgroup is the special root in type $B$. See the introduction here or this book for more details.)

Recall that in $\mathrm{Gr}(2n+1,n)$, the Schubert varieties are indexed by partitions $\lambda$ whose Young diagrams fit inside an $n\times (n+1)$ rectangle. Suppose we divide this rectangle into two staircases as shown below using the red cut, and only consider the partitions $\lambda$ that are symmetric with respect to the reflective map taking the upper staircase to the lower.

We claim that the Schubert varieties of the orthogonal Grassmannian are indexed by the “shifted partitions” formed by ignoring the lower half of these symmetric partition diagrams. In fact, the Schubert varieties consist of the isotropic elements of the ordinary Schubert varieties, giving a natural embedding $\mathrm{OG}(2n+1,n)\to \mathrm{Gr}(2n+1,n)$ that respects the Schubert decompositions.

To get a sense of how this works, let’s look at the example of the partition $(4,3,1)$ shown above, in the case $n=4$. As described in this post, in the ordinary Grassmannian, the Schubert cell $\Omega_\lambda^{\circ}$ with respect to the standard flag is given by the set of vector spaces spanned by the rows of matrices whose reduced row echelon form looks like:

Combinatorially, this is formed by drawing the staircase pattern shown at the left, then adding the partition parts $(4,3,1)$ to it, and placing a $1$ at the end of each of the resulting rows for the reduced row echelon form.

Now, which of these spaces are isotropic? In other words, what is $\Omega_\lambda^{\circ}\cap \mathrm{OG}(2n+1,n)$? Well, suppose we label the starred entries as shown, where we have omitted the $0$’s to make the entries more readable:

Then I claim that the entries $l,j,k,h,i,e$ are all uniquely determined by the values of the remaining variables $a,b,c,d,f,g$. Thus there is one isotropic subspace in this cell for each choice of values $a,b,c,d,f,g$, corresponding to the “lower half” of the partition diagram we started with:

Indeed, let the rows of the matrix be labeled $\mathbf{1},\mathbf{2},\mathbf{3},\mathbf{4}$ from top to bottom as shown, and suppose its row span is isotropic. Since row $\mathbf{1}$ and $\mathbf{4}$ are orthogonal with respect to the reverse dot product, we get the relation $$l+a=0,$$ which expresses $l=-a$ in terms of $a$.

Now, rows $\mathbf{2}$ and $\mathbf{4}$ are also orthogonal, which means that $$b+k=0,$$ so we can similarly eliminate $k$. From rows $\mathbf{2}$ and $\mathbf{3}$, we obtain $f+j=0$, which expresses $j$ in terms of the lower variables. We then pair row $\mathbf{3}$ with itself to see that $h+g^2=0$, eliminating $h$, and finally pairing $\mathbf{3}$ with $\mathbf{4}$ we have $i+gc+d=0$, so $i$ is now expressed in terms of lower variables as well.

Moreover, these are the only relations we get from the isotropic condition – any other pairings of rows give the trivial relation $0=0$. So in this case the Schubert variety restricted to the orthogonal Grassmannian has half the dimension of the original, generated by the possible values for $a,b,c,d,f,g$.

General elimination argument

Why does this elimination process work in general, for a symmetric shape $\lambda$? Label the steps of the boundary path of $\lambda$ by $1,2,3,\ldots$ from SW to NE in the lower left half, and label them from NE to SW in the upper right half, as shown:

Then the labels on the vertical steps in the lower left half give the column indices of the $1$’s in the corresponding rows of the matrix. The labels on the horizontal steps in the upper right half, which match these labels by symmetry, give the column indices, counted from the right, of the corresponding starred columns.

This means that the $1$’s in the lower left of the matrix correspond to the opposite columns of those containing letters in the upper right half. It follows that we can use the orthogonality relations to pair a $1$ (which is leftmost in its row) with a column entry in a higher or equal row so as to express that entry in terms of other letters to its lower left. The $1$ is in a lower or equal row in these pairings precisely for the entries whose corresponding square lies above the staircase cut. Thus we can always express the upper right variables in terms of the lower left, as in our example above.

Isotropic implies symmetric

We can now conversely show that if a point of the Grassmannian is isotropic, then its corresponding partition is symmetric about the staircase cut. Indeed, we claim that if the partition is not symmetric, then two of the $1$’s of the matrix will be in opposite columns, resulting in a nonzero reverse dot product between these two rows.

To show this, first note that we cannot have a $1$ in the middle column, for otherwise its row would not be orthogonal to itself. Therefore, of the remaining $2n$ columns, some $n$ of them have a $1$ in them, and the complementary columns must be the other $n$, the ones without a $1$. In other words, each pair of opposite columns contains exactly one pivot, so the choice of pivot columns among the first $n$ columns forces the rest. This implies that the partition is symmetric.

Coming soon: Combinatorics of shifted partitions and tableaux

From the above analysis, it follows that the orthogonal Grassmannian is partitioned into Schubert cells given by the data of a shifted partition, the upper half of a symmetric partition diagram:

Such a partition is denoted by its distinct “parts”, the number of squares in each row, thought of as the parts of a partition with distinct parts shifted over by the staircase. For instance, the shifted partition above is denoted $(3,1)$.

The beauty of shifted partitions is that so much of the original tableaux combinatorics that goes into ordinary Schubert calculus works almost the same way for shifted tableaux and the orthogonal Grassmannian. We can define jeu de taquin, Knuth equivalence, and dual equivalence on shifted tableaux, there is a natural notion of a Littlewood-Richardson shifted tableau, and these notions give rise to formulas for both intersections of Schubert varieties in the orthogonal Grassmannian and to products of Schur $P$-functions or Schur $Q$-functions, both of which are specializations of the Hall-Littlewood polynomials.

I’ll likely go further into the combinatorics of shifted tableaux in future posts, but for now, I hope you enjoyed the breakdown of why shifted partitions are the natural indexing objects for the Schubert decomposition of the orthogonal Grassmannian.

Ellipses, parabolas, and infinity

A parabola can be defined as the locus of points equidistant from a fixed point (called the focus) and a fixed line (called the directrix). But we know from projective geometry that parabolas and ellipses are essentially the same object. Does this mean we can also define an ellipse in terms of a point and a line?

While tutoring a high school student recently, we worked through a problem that was essentially showing just that in a special case. Remarkably:

An ellipse can be defined as the locus of points $P$ for which the distance $|PF|$ to a focus $F$ is $\alpha$ times the distance from $P$ to a fixed line $\ell$ for some positive real number $\alpha\lt 1$.

For instance, if $\alpha=1/2$ we get an ellipse, whereas if $\alpha=1$ then it’s such a “large” ellipse that it’s a parabola!

Let’s check the math here. Recall that an ellipse is usually defined, synthetically, as the locus of points $P$ in the plane for which the sum of the distances $|PF_1|+|PF_2|$ for two fixed foci $F_1$ and $F_2$ is a fixed constant $s$. By translating this condition into coordinates, one can show that if we place an ellipse with its two lines of symmetry aligned with the $x$- and $y$-axes (with the wider part on the $x$-axis), centered at the origin, then the ellipse will have equation $$(x/b)^2+(y/a)^2=1$$ for some positive real numbers $a$ and $b$ with $a\le b$.

For an ellipse $E$ with this equation, the foci $F_1$ and $F_2$ have coordinates $(-c,0)$ and $(c,0)$ for some $c$. To find $c$ in terms of $a$ and $b$, note that the sum of the distances of the point $(b,0)$ to the foci is $(b-c)+(b+c)=2b$, and the sum of the distances of the point $(0,a)$ to the foci is $2\sqrt{a^2+c^2}$ by the Pythagorean theorem. Therefore $2b=2\sqrt{a^2+c^2}$, and solving we find $c=\sqrt{b^2-a^2}$.

Now, I claim that if we set $\alpha=\frac{\sqrt{b^2-a^2}}{b}$ and let $\ell$ be the vertical line $x=b/\alpha$, then $E$ is the locus of points $P$ for which $|PF_2|$ is $\alpha$ times the distance from $P$ to $\ell$. Indeed, let $P=(x,y)$ be a point that has this property. Then the distance $|PF_2|$ is $\sqrt{(x-\sqrt{b^2-a^2})^2+y^2}$ and the distance from $P$ to $\ell$ is $\frac{b}{\alpha}-x$, so we have
$$\begin{align*}
\sqrt{(x-\sqrt{b^2-a^2})^2+y^2} &= \alpha\left(\frac{b}{\alpha}-x\right) \\
(x-\sqrt{b^2-a^2})^2+y^2 &= (b-\alpha x)^2 \\
x^2-2x\sqrt{b^2-a^2}+b^2-a^2+y^2 &= b^2-2x \sqrt{b^2-a^2} + (\alpha x)^2 \\
x^2-a^2+y^2 &= \frac{b^2-a^2}{b^2}x^2 \\
\frac{a^2}{b^2}x^2 + y^2 &= a^2 \\
\frac{x^2}{b^2} + \frac{y^2}{a^2} &= 1
\end{align*}$$
which is indeed the equation of $E$.
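For the skeptical (or lazy) reader, here’s a quick symbolic check of that computation with sympy (my own throwaway script):

```python
# Verify: |PF_2|^2 - alpha^2 * dist(P, l)^2 reduces to the ellipse equation.
import sympy as sp

x, y, a, b = sp.symbols('x y a b', positive=True)
c = sp.sqrt(b**2 - a**2)   # focal distance, assuming b > a
alpha = c / b

eq = sp.expand((x - c)**2 + y**2 - alpha**2 * (b / alpha - x)**2)
print(sp.simplify(eq))     # a**2*x**2/b**2 + y**2 - a**2 (up to rearrangement)
```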

A few noteworthy observations: first, it’s remarkable that the focus of the ellipse as defined in terms of the constant-sum-from-two-foci definition coincides with the focus that appears in the focus-and-directrix version. This makes one wonder if there is some natural relationship between the foci and this new directrix $\ell$, perhaps in terms of reciprocation (see this post for the definition of reciprocation in a circle). And indeed, if we apply the transformation $(x,y)\mapsto (x/b,y/a)$, which maps our ellipse $E$ to the unit circle, the point $F_2$ maps to the point $(\alpha,0)$ and $\ell$ becomes the line $x=1/\alpha$, so indeed $F_2$ and $\ell$ form a reciprocal pair!

Second, consider the degenerate case of a circle, when $a=b$ and so $\alpha=0$. The condition $\alpha=0$ doesn’t really make sense unless we interpret the diagram in the projective plane and allow the directrix to be the line at infinity, which again makes the focus (the center of the circle) the pole of this line.

Finally, consider the limit as $b$ approaches $\infty$, so that the ellipse is stretching out further and further until it becomes a parabola. (Exercise for the reader: What projective transformation maps an ellipse into a parabola?) In this limit we have $$\lim_{b\to \infty} \alpha = \lim_{b\to \infty} \frac{\sqrt{b^2-a^2}}{b}=\sqrt{\lim_{b\to \infty}1-(a^2/b^2)}=1,$$ and again we recover the case of a parabola. As a bonus, we find that the focus and directrix of a parabola must be reciprocal to each other across the parabola as well.

That’s all for today – a bit of fun with conics to kick off the new year. Happy and indivisible 2017!

Halloween Candy and Counting

Happy Halloween! It’s that time of year in which we celebrate ghosts, pumpkins, and fear itself. So, what better time to discuss a very common fear among adults these days: Mathematics!

If you’re reading this blog, I’m guessing you’re probably already not too afraid of mathematics. But I hope you share this post with people who are somewhat spooked by it but like to face their fears now and then. And let’s face it, even for math lovers, every difficult-sounding math problem is always a little scary at first… until you work it out and realize that there’s only beauty behind the mask.

I recently made up the following problem for a friend teaching a discrete mathematics class:

Five kids, dressed as a Ghost, a Witch, a Monster, a Skeleton, and a Black Cat, knock at your door. You open it and welcome them in, but you realize you only have $3$ Snickers bars and $3$ Kit Kats left in your candy stash!

Since you have $6$ pieces of candy and there are only $5$ kids, you decide to give both a Kit Kat and a Snickers bar to the scariest costume, and then give the remaining four kids one piece each. How many different ways can you choose who to give what candy to?

Eeek! Ghosts and witches! And combinatorics!

Well, let’s consider. There are $5$ ways you can decide on the scariest costume, since there are $5$ kids to choose from. But for each of those choices, you also have to pick which two of the remaining four get the Kit Kat and which two get the Snickers.

To make this a little easier, suppose we picked the Ghost as the scariest, and so we have to choose which two of the Witch, Monster, Skeleton, and Cat get the Snickers bars (and then the other two will get the Kit Kat). Well, we could pick one of the following options:
1. Witch and Monster
2. Witch and Skeleton
3. Witch and Cat
4. Monster and Skeleton
5. Monster and Cat
6. Skeleton and Cat.
(Notice that I ordered these by first choosing all those that can pair with the Witch, and then all those without the Witch, to make sure I didn’t miss any.)

So there are $6$ ways to choose who gets the Snickers and who gets the Kit Kats, assuming we chose Ghost as Scariest. But this is the same computation no matter who of the $5$ we chose as Scariest. If we chose the Witch as scariest there would still be $6$ possibilities, and same for the other three. Therefore, there are $5\cdot 6=30$ total possibilities.

The nice thing here is that there’s a known formula for computing the $6$ possibilities for the Kit Kats and Snickers – the number of ways of choosing $2$ things out of $4$ things is written $\binom{4}{2}$, pronounced “four choose two”. The formula for $\binom{a}{b}$, the number of ways to choose $b$ things from $a$ things, is known to be $\frac{a!}{b!\cdot (a-b)!}$ where $n!$ means the product of all the numbers from $1$ to $n$. So to compute $\binom{4}{2}$, we just compute $$\frac{4!}{2!\cdot 2!}=\frac{1\cdot 2 \cdot 3 \cdot 4}{1\cdot 2 \cdot 1\cdot 2}=6.$$ This is a shortcut for counting the possibilities without listing them all out.
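If you’d like to double-check this with a computer rather than by listing, here’s a tiny Python script (mine, just for fun) that enumerates all the possibilities:

```python
# Pick the scariest kid, then choose 2 of the remaining 4 for the
# Snickers bars; the other two get the Kit Kats.
from itertools import combinations

kids = ["Ghost", "Witch", "Monster", "Skeleton", "Cat"]
ways = [(scariest, snickers)
        for scariest in kids
        for snickers in combinations([k for k in kids if k != scariest], 2)]
print(len(ways))   # 30
```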

Can you see why the formula for computing $\binom{a}{b}$ is true?


The structure of the Garsia-Procesi modules $R_\mu$

Somehow, in all the time I’ve posted here, I’ve not yet described the structure of my favorite graded $S_n$-modules. I mentioned them briefly at the end of the Springer Correspondence series, and talked in depth about a particular one of them – the ring of coinvariants – in this post, but it’s about time for…

The Garsia-Procesi modules!

Also known as the cohomology rings of the Springer fibers in type $A$, or as the coordinate rings of the intersections of the closures of nilpotent conjugacy classes of $n\times n$ matrices with a torus, with a link between these two interpretations given in a paper of De Concini and Procesi. But the work of Tanisaki, and of Garsia and Procesi, allows us to work with these modules in an entirely elementary way.

Using Tanisaki’s approach, we can define $$R_{\mu}=\mathbb{C}[x_1,\ldots,x_n]/I_{\mu},$$ where $I_{\mu}$ is the ideal generated by the partial elementary symmetric functions defined as follows. Recall that the elementary symmetric function $e_d(z_1,\ldots,z_k)$ is the sum of all square-free monomials of degree $d$ in the set of variables $z_i$. Let $S\subset\{x_1,\ldots,x_n\}$ with $|S|=k$. Then the elementary symmetric function $e_r(S)$ in this subset of the variables is called a partial elementary symmetric function, and we have $$I_{\mu}=(e_r(S) : k-d_k(\mu) \lt r \le k, |S|=k).$$ Here, $d_k(\mu)=\mu’_n+\mu’_{n-1}+\cdots+ \mu’_{n-k+1}$ is the number of boxes in the last $k$ columns of $\mu$, where we pad the transpose partition $\mu’$ with $0$’s so that it has $n$ parts.

This ring $R_\mu$ inherits the natural action of $S_n$ on $\mathbb{C}[x_1,\ldots,x_n]$ by permuting the variables, since $I_\mu$ is fixed under this action. Since $I_\mu$ is also a homogeneous ideal, $R_\mu$ is a graded $S_n$-module, graded by degree.

To illustrate the construction, suppose $n=4$ and $\mu=(3,1)$. Then to compute $I_{\mu}$, first consider subsets $S$ of $\{x_1,\ldots,x_4\}$ of size $k=1$. We have $d_1(\mu)=0$ since the fourth column of the Young diagram of $\mu$ is empty (see image below), and so in order for $e_r(S)$ to be in $I_\mu$ we must have $1-0\lt r \le 1$, which is impossible. So there are no partial elementary symmetric functions in $1$ variable in $I_\mu$.

[Figure: the Young diagram of $\mu=(3,1)$, illustrating the column counts $d_k(\mu)$.]

For subsets $S$ of size $k=2$, we have $d_2(\mu)=1$ since there is one box among the last two columns (columns $3$ and $4$) of $\mu$, and we must have $2-1\lt r\le 2$. So $r$ can only be $2$, and we have the partial elementary symmetric functions $e_2(S)$ for all subsets $S$ of size $2$. This gives us the six polynomials $$x_1x_2,\hspace{0.3cm} x_1x_3,\hspace{0.3cm} x_1x_4,\hspace{0.3cm} x_2x_3,\hspace{0.3cm} x_2x_4,\hspace{0.3cm} x_3x_4.$$

For subsets $S$ of size $k=3$, we have $d_3(\mu)=2$, and so $3-2 \lt r\le 3$. We therefore have $e_2(S)$ and $e_3(S)$ for each such subset $S$ in $I_\mu$, and this gives us the eight additional polynomials $$x_1x_2+x_1x_3+x_2x_3, \hspace{0.5cm}x_1x_2+x_1x_4+x_2x_4,$$ $$x_1x_3+x_1x_4+x_3x_4,\hspace{0.5cm} x_2x_3+x_2x_4+x_3x_4,$$ $$x_1x_2x_3, \hspace{0.4cm} x_1x_2x_4, \hspace{0.4cm} x_1x_3x_4,\hspace{0.4cm} x_2x_3x_4$$

Finally, for $S=\{x_1,x_2,x_3,x_4\}$, we have $d_4(\mu)=4$ and so $4-4\lt r\le 4$. Thus all of the full elementary symmetric functions $e_1,\ldots,e_4$ in the four variables are also relations in $I_{\mu}$. All in all we have
$$\begin{align*}
I_{(3,1)}= &(e_2(x_1,x_2), e_2(x_1,x_3),\ldots, e_2(x_3,x_4), \\
& e_2(x_1,x_2,x_3), \ldots, e_2(x_2,x_3,x_4), \\
& e_3(x_1,x_2,x_3), \ldots, e_3(x_2,x_3,x_4), \\
& e_1(x_1,\ldots,x_4), e_2(x_1,\ldots,x_4), e_3(x_1,\ldots,x_4), e_4(x_1,\ldots,x_4))
\end{align*}$$
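To see the definition in action, here’s a short sympy sketch (my own; the function names are made up) that generates exactly these relations for $\mu=(3,1)$ and $n=4$:

```python
# List the partial elementary symmetric generators of I_mu, following
# the definition above, for mu = (3,1) (so mu' = (2,1,1)) and n = 4.
from itertools import combinations
from sympy import symbols, Add, Mul

def d_k(mu_conj, n, k):
    # number of boxes in the last k columns: mu'_n + ... + mu'_{n-k+1}
    padded = list(mu_conj) + [0] * (n - len(mu_conj))
    return sum(padded[n - k:])

def e_r(subset, r):
    # elementary symmetric polynomial e_r in the chosen variables
    return Add(*[Mul(*c) for c in combinations(subset, r)])

n = 4
mu_conj = [2, 1, 1]
x = symbols('x1:%d' % (n + 1))

gens = []
for k in range(1, n + 1):
    for S in combinations(x, k):
        for r in range(k - d_k(mu_conj, n, k) + 1, k + 1):
            gens.append(e_r(S, r))
print(len(gens))   # 18 generators: 6 + 8 + 4, matching the example above
```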

As two more examples, it’s clear that $R_{(1^n)}=\mathbb{C}[x_1,\ldots,x_n]/(e_1,\ldots,e_n)$ is the ring of coinvariants under the $S_n$ action, and $R_{(n)}=\mathbb{C}$ is the trivial representation. So $R_\mu$ is a generalization of the coinvariant ring, and in fact the graded Frobenius characteristic of $R_\mu$ is the Hall-Littlewood polynomial $\widetilde{H}_\mu(x;q)$.

Where do these relations come from? The rings $R_\mu$ were originally defined as follows. Let $A$ be a nilpotent $n\times n$ matrix over $\mathbb{C}$. Then $A$ has all $0$ eigenvalues, and so it is conjugate to a matrix in Jordan normal form whose Jordan blocks have all $0$’s on the diagonal. The sizes of the Jordan blocks, written in nonincreasing order, form a partition $\mu’$, and this partition uniquely determines the conjugacy class of $A$. In other words:

There is exactly one nilpotent conjugacy class $C_{\mu’}$ in the space of $n\times n$ matrices for each partition $\mu’$ of $n$.

The closures of these conjugacy classes $\overline{C_{\mu’}}$ form closed matrix varieties, and their coordinate rings were studied here. However, they are easier to get a handle on after intersecting with the set $T$ of diagonal matrices, leading to an interesting natural question: what is the subvariety of diagonal matrices in the closure of the nilpotent conjugacy class $C_{\mu’}$? Defining $R_\mu=\mathcal{O}(\overline{C_{\mu’}}\cap T)$, we obtain the same modules as above.

Tanisaki found the presentation for $R_\mu$ given above using roughly the following argument. Consider the matrix $A-tI$, where $A\in C_{\mu’}$. Then one can show (see, for instance, the discussion of invariant factors and elementary divisors in the article on Smith normal form on Wikipedia) that the largest power of $t$ dividing all of the $k\times k$ minors of $A-tI$, say $t^{d_k}$, is fixed under conjugation, so we can assume $A$ is in Jordan normal form. Then it’s not hard to see, by analyzing the Jordan blocks, that this power of $t$ is $t^{d_k(\mu)}$, where $\mu$ is the transpose partition of $\mu’$ and $d_k(\mu)$ is defined as above – the number of boxes in the last $k$ columns of $\mu$.

It follows that any element of the closure of $C_{\mu’}$ must also have this property, and so if $X=\mathrm{diag}(x_1,\ldots,x_n)\in \overline{C_{\mu’}}\cap T$ then we have $$t^{d_k(\mu)} \mid (x_{i_1}-t)(x_{i_2}-t)\cdots (x_{i_k}-t)$$ for any subset $S=\{x_{i_1},\ldots,x_{i_k}\}$ of $\{x_1,\ldots,x_n\}$. Expanding the right hand side as a polynomial in $t$ using Vieta’s formulas, we see that the elementary symmetric functions $e_r(S)$ vanish on $X$ as soon as $r \gt k-d_k(\mu)$, which are exactly the relations that generate $I_\mu$ above.
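Here’s a small sympy experiment (mine; brute force over all minors, for one example) illustrating the divisibility claim for the Jordan type $\mu’=(2,2)$, where $\mu=(2,2)$ and $(d_1,d_2,d_3,d_4)=(0,0,2,4)$:

```python
# The gcd of all k x k minors of A - tI should be t^{d_k(mu)}:
# here A is nilpotent of Jordan type mu' = (2,2), so mu = (2,2).
from functools import reduce
from itertools import combinations
import sympy as sp

t = sp.symbols('t')
A = sp.zeros(4, 4)
A[0, 1] = 1    # first Jordan block of size 2
A[2, 3] = 1    # second Jordan block of size 2
M = A - t * sp.eye(4)

for k in range(1, 5):
    minors = [M.extract(list(r), list(c)).det()
              for r in combinations(range(4), k)
              for c in combinations(range(4), k)]
    g = reduce(sp.gcd, [m for m in minors if m != 0])
    print(k, sp.degree(g, t))    # degrees 0, 0, 2, 4 = d_k(mu)
```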

It takes somewhat more work to prove that these relations generate the entire ideal, but this can be done by showing that $R_\mu$ has the right dimension, namely the multinomial coefficient $\binom{n}{\mu}$. And for that, we’ll discuss on page 2 the monomial basis of Garsia and Procesi.

Can you Prove it… combinatorially?

This year’s Prove it! Math Academy was a big success, and it was an enormous pleasure to teach the seventeen talented high school students that attended this year. Some of the students mentioned that they felt even more inspired to study math further after our two-week program, but the inspiration went both ways – they inspired me with new ideas as well!


One of the many, many things we investigated at the camp was the Fibonacci sequence, formed by starting with the two numbers $0$ and $1$ and then at each step, adding the previous two numbers to form the next: $$0,1,1,2,3,5,8,13,21,\ldots$$ If $F_n$ denotes the $(n+1)$st term of this sequence (so that $F_0=0$ and $F_1=1$), then there is a remarkable closed formula for $F_n$, known as Binet’s formula:

$$F_n=\frac{1}{\sqrt{5}}\left( \left(\frac{1+\sqrt{5}}{2}\right)^n - \left(\frac{1-\sqrt{5}}{2}\right)^n \right)$$

Looks crazy, right? Why would there be $\sqrt 5$’s showing up in a sequence of integers?

At Prove it!, we first derived the formula using generating functions. I mentioned during class that it can also be proven by induction, and later, one of our students was trying to work out the induction proof on a white board outside the classroom. She was amazed how many different proofs there could be of the same fact, and it got me thinking: what if we expand each of the terms using the binomial theorem? Is there a combinatorial proof of the resulting identity?

In particular, suppose we use the binomial theorem to expand $(1+\sqrt{5})^n$ and $(1-\sqrt{5})^n$ in Binet’s formula. The resulting expression is:

$$F_n=\frac{1}{\sqrt{5}\cdot 2^n}\left( \left(\sum_{i=0}^n \binom{n}{i}(\sqrt{5})^i \right) – \left(\sum_{i=0}^n (-1)^i\binom{n}{i}(\sqrt{5})^i \right) \right)$$

Note that the even terms in the two summations cancel, and combining the odd terms gives us:

$$F_n=\frac{1}{\sqrt{5}\cdot 2^n}\left( \sum_{k=0}^{\lfloor n/2\rfloor} 2 \binom{n}{2k+1}(\sqrt{5})^{2k+1} \right)$$
Since $(\sqrt{5})^{2k+1}=\sqrt{5}\cdot 5^k$, we can cancel the factors of $\sqrt{5}$ and multiply both sides by $2^{n-1}$ to obtain:

$$2^{n-1}\cdot F_n=\sum_{k=0}^{\lfloor n/2\rfloor} \binom{n}{2k+1}\cdot 5^k.$$
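Before hunting for a combinatorial proof, it’s reassuring to check the identity numerically. A quick Python check (my own script):

```python
# Check 2^(n-1) * F_n = sum_k binom(n, 2k+1) * 5^k for small n.
from math import comb

def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

for n in range(1, 25):
    lhs = 2 ** (n - 1) * fib(n)
    rhs = sum(comb(n, 2 * k + 1) * 5 ** k for k in range(n // 2 + 1))
    assert lhs == rhs
print("identity verified for n = 1, ..., 24")
```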

Now, the left hand and right hand sides are clearly nonnegative integers, and one handy fact about nonnegative integers is that they count the number of elements in some collection. The proof method of counting in two ways is the simple principle that if by some method one can show that a collection $A$ has $n$ elements, and by another method one can show that $A$ has $m$ elements, then it follows that $m=n$. Such a “combinatorial proof” might be used to prove the identity above, with $m$ being the left hand side of the equation and $n$ being the right.

I started thinking about this after Prove it! ended, and remembered that the $(n+1)$st Fibonacci number $F_n$ counts the number of ways to color a row of $n-2$ fenceposts either black or white such that no two adjacent ones are black. (Can you see why this combinatorial construction would satisfy the Fibonacci recurrence?) For instance, we have $F_5=5$, because there are five such colorings of a row of $3$ fenceposts:
$$\begin{array}{ccc} \square & \square & \square
\\ \\ \square & \square & \blacksquare
\\ \\ \square & \blacksquare & \square
\\ \\ \blacksquare & \square & \square
\\ \\ \blacksquare & \square & \blacksquare
\end{array}$$

Note also that $2^{n-1}$ counts the number of length-$(n-1)$ sequences of $0$’s and $1$’s. Thus, the left hand side of our identity, $2^{n-1}\cdot F_n$, counts the number of ways of choosing a binary sequence of length $n-1$ and also a fence post coloring of length $n-2$. Because of their lengths, given such a pair we can interlace their entries, forming an alternating sequence of digits and fence posts such as: $$1\, \square\, 0\, \square\, 1\, \blacksquare\, 1$$ We will call such sequences interlaced sequences.

We now need only to show that the right hand side also counts these interlaced sequences. See the next page for my solution, or post your own solution in the comments below!

PhinisheD!

Sometimes it’s the missteps in life that lead to the greatest adventures down the road.

For me, my pursuit of a Ph.D. in mathematics, specifically in algebraic combinatorics, might be traced back to my freshman year as an undergraduate at MIT. Coming off of a series of successes in high school math competitions and other science-related endeavors (thanks to my loving and very mathematical family!), I was a confident and excited 18-year old whose dream was to become a physicist and use my mathematical skills to, I don’t know, come up with a unified field theory or something.

Me at the age of 18-ish.

But I loved pure math too, and a number of my friends were signed up for the undergraduate Algebraic Combinatorics class in the spring, so my young ambitious self added it to my already packed course load. I had no idea what “Algebraic Combinatorics” even meant, but I did hear that it was being taught by Richard Stanley, a world expert in the area. How could I pass up that chance? What if he didn’t teach it again before I left MIT?

On the first day of the class, Stanley started with a simple combinatorial question. It was something like the following: In a complete graph with $n$ vertices, how many walks of length $k$ starting at vertex $v$ end up back at vertex $v$ on the last step? For instance, if $n=5$ and $k=2$, there are four closed walks of length two, from $v$ to any other vertex and back again.

There is an elementary (though messy) way to solve this, but Stanley went forth with an algebraic proof. He considered the adjacency matrix $A$ of the complete graph, and showed that the total number of loops of length $k$ starting from any vertex is the trace of $A^k$. One can then compute this trace using eigenvalues and divide by $n$ to get the number of loops starting at $v$. Beautiful!
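(Here’s a quick numpy check of that argument, my own script rather than anything from the lecture:)

```python
# Count closed walks of length k in the complete graph K_n via trace(A^k).
import numpy as np

n, k = 5, 2
A = np.ones((n, n), dtype=int) - np.eye(n, dtype=int)   # adjacency matrix of K_5
walks_from_v = np.trace(np.linalg.matrix_power(A, k)) // n
print(walks_from_v)   # 4
```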

I remember sitting in my seat, wide-eyed, watching Richard Stanley quietly but authoritatively discuss the technique. It was incredible to me that advanced tools from linear algebra could be used to so elegantly solve such a simple, concrete problem. To use a term from another area of algebraic combinatorics, I was hooked.

But I was also a freshman, and didn’t yet have a strong grasp of some of the other algebraic concepts being used in the course. I studied hard but wound up with a B+ in the class. Me, get a B+ in a math class? I was horrified, my 18-year-old Little-Miss-Perfect confidence shattered. Now, not only was I fascinated with the subject, I gained respect for it. It was a worthy challenge, and I couldn’t help but come back for more.

In the years that followed, I took more courses on similar subjects and wrote several undergraduate research papers. I dabbled in other areas as well, but was always drawn back to the interplay between combinatorics and algebra. I now find myself, as of Friday, May 20, 2016, having completed my Ph.D. at UC Berkeley on a topic in algebraic combinatorics…


…and I often wonder how much that silly little B+ motivated me throughout the years.

(See page 2 for a summary of my thesis. My full thesis can be found here.)

What do Schubert curves, Young tableaux, and K-theory have in common? (Part III)

This is the third and final post in our expository series of posts (see Part I and Part II) on the recent paper coauthored by Jake Levinson and myself.

Last time, we discussed the fact that the operator $\omega$ on certain Young tableaux is actually the monodromy operator of a certain covering map from the real locus of the Schubert curve $S$ to $\mathbb{RP}^1$. Now, we’ll show how our improved algorithm for computing $\omega$ can be used to approach some natural questions about the geometry of the curve $S$. For instance, how many (complex) connected components does $S$ have? What is its genus? Is $S$ a smooth (complex) curve?

The genus of $S$

The arithmetic genus of a connected curve $S$ can be defined as $g=1-\chi(\mathcal{O}_S)$ where $$\chi(\mathcal{O}_S)=\dim H^0(\mathcal{O}_S)-\dim H^1(\mathcal{O}_S)$$ is the Euler characteristic and $\mathcal{O}_S$ is the structure sheaf. So, to compute the genus it suffices to compute the Euler characteristic, which can alternatively be defined in terms of the $K$-theory of the Grassmannian.

The $K$-theory ring $K(\mathrm{Gr}(n,k))$

The $K$-theory ring $K(X)$ of a scheme $X$ is defined as follows. First, consider the free abelian group $G$ generated by isomorphism classes of locally free coherent sheaves (a.k.a. vector bundles) on $X$. Then define $K(X)$, as a group, to be the quotient of $G$ by “short exact sequences”, that is, the quotient $G/H$ where $H$ is the subgroup generated by expressions of the form $[\mathcal{E}_1]+[\mathcal{E}_2]-[\mathcal{E}]$ where $0\to \mathcal{E}_1 \to \mathcal{E} \to \mathcal{E}_2 \to 0$ is a short exact sequence of vector bundles on $X$. This gives the additive structure on $K(X)$, and the tensor product operation on vector bundles makes it into a ring. It turns out that, when $X$ is smooth (such as a Grassmannian!), we get the exact same ring if we remove the “locally free” restriction and consider coherent sheaves modulo short exact sequences.

Where does this construction come from? Well, a simpler example of $K$-theory is the construction of the Grothendieck group of an abelian monoid. Consider an abelian monoid $M$ (recall that a monoid is a set with an associative binary operation and an identity element, like a group without inverses). We can construct an associated group $K(M)$ by taking the quotient of the free abelian group generated by elements $[m]$ for $m\in M$ by the subgroup generated by expressions of the form $[m]+[n]-[m+n]$. So, for instance, $K(\mathbb{N})=\mathbb{Z}$. In a sense we are groupifying monoids. The natural monoidal operation on vector spaces is $\oplus$, so if $X$ is a point, then all short exact sequences split and the $K$-theory ring $K(X)$ is the Grothendieck ring of this monoid.

A good exposition on the basics of $K$-theory can be found here, and for the $K$-theory of Grassmannians, see Buch’s paper. For now, we’ll just give a brief description of how the $K$-theory of the Grassmannian works, and how it gives us a handle on the Euler characteristic of Schubert curves.

Recall from this post that the CW complex structure given by the Schubert varieties shows that the classes $[\Omega_\lambda]$, where $\lambda$ is a partition fitting inside a $k\times (n-k)$ rectangle, generate the cohomology ring $H^\ast(\mathrm{Gr}(n,k))$. Similarly, the $K$-theory ring is a filtered ring generated by the classes of the coherent sheaves $[\mathcal{O}_{\lambda}]$ where, if $\iota$ is the inclusion map $\iota:\Omega_\lambda\to \mathrm{Gr}(n,k)$, then $\mathcal{O}_\lambda=\iota_\ast \mathcal{O}_{\Omega_\lambda}$. Multiplication of these basic classes is given by a variant of the Littlewood-Richardson rule: $$[\mathcal{O}_\lambda]\cdot [\mathcal{O}_\mu]=\sum_\nu (-1)^{|\nu|-|\lambda|-|\mu|}c^\nu_{\lambda\mu}[\mathcal{O}_\nu]$$ where if $|\nu|=|\lambda|+|\mu|$ then $c^{\nu}_{\lambda\mu}$ is the usual Littlewood-Richardson coefficient. If $|\nu|<|\lambda|+|\mu|$ then the coefficient is zero, and if $|\nu|>|\lambda|+|\mu|$ then $c^{\nu}_{\lambda\mu}$ is a nonnegative integer. We will refer to these nonnegative values as $K$-theory coefficients.
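To make the $|\nu|=|\lambda|+|\mu|$ case concrete, here’s a brute-force Python sketch (my own, exponential-time, only sensible for tiny shapes) that counts Littlewood-Richardson skew tableaux directly; it does not compute the genuine $K$-theory coefficients:

```python
# Count LR skew tableaux of shape nu/lambda and content mu: semistandard
# fillings whose reverse reading word (rows top to bottom, each read
# right to left) is ballot.
from itertools import product

def lr_coeff(nu, lam, mu):
    lam = list(lam) + [0] * (len(nu) - len(lam))
    cells = [(i, j) for i in range(len(nu)) for j in range(lam[i], nu[i])]
    if len(cells) != sum(mu):
        return 0
    count = 0
    for filling in product(range(1, len(mu) + 1), repeat=len(cells)):
        if any(filling.count(k + 1) != mu[k] for k in range(len(mu))):
            continue   # wrong content
        T = dict(zip(cells, filling))
        if any((i, j - 1) in T and T[i, j - 1] > T[i, j] for (i, j) in T):
            continue   # rows must weakly increase
        if any((i - 1, j) in T and T[i - 1, j] >= T[i, j] for (i, j) in T):
            continue   # columns must strictly increase
        word = [T[c] for c in sorted(cells, key=lambda c: (c[0], -c[1]))]
        counts, ok = [0] * (len(mu) + 1), True
        for x in word:   # ballot: each prefix has at least as many j's as (j+1)'s
            counts[x] += 1
            if x > 1 and counts[x] > counts[x - 1]:
                ok = False
                break
        count += ok
    return count

print(lr_coeff((2, 1), (1,), (1, 1)))   # 1
print(lr_coeff((2, 2), (1,), (2, 1)))   # 1
```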

$K$-theory and the Euler characteristic

The $K$-theory ring is especially useful in computing Euler characteristics. It turns out that the Euler characteristic gives an (additive) group homomorphism $\chi:K(X)\to \mathbb{Z}$. To show this, it suffices to show that if $0\to \mathcal{A}\to \mathcal{B}\to \mathcal{C}\to 0$ is a short exact sequence of coherent sheaves on $X$, then $\chi(\mathcal{A})+\chi(\mathcal{C})-\chi(\mathcal{B})=0$. Indeed, such a short exact sequence gives rise to a long exact sequence in cohomology:

$$
\begin{array}{ccccccc}
0 \to & H^0(\mathcal{A}) & \to & H^0(\mathcal{B}) & \to & H^0(\mathcal{C}) & \\
\to & H^1(\mathcal{A}) & \to & H^1(\mathcal{B}) & \to & H^1(\mathcal{C}) & \\
\to & H^2(\mathcal{A}) & \to & H^2(\mathcal{B}) & \to & H^2(\mathcal{C}) & \to \cdots
\end{array}
$$

and the alternating sum of the dimensions in any exact sequence must be zero. Thus we have $$\begin{eqnarray*}0&=&\sum_i (-1)^i\dim H^i(\mathcal{A})-\sum_i (-1)^i\dim H^i(\mathcal{B})+\sum_i (-1)^i\dim H^i(\mathcal{C}) \\ &=&\chi(\mathcal{A})+\chi(\mathcal{C})-\chi(\mathcal{B})\end{eqnarray*}$$ as desired.

Therefore, it makes sense to talk about the Euler characteristic of a class of coherent sheaves in $K(X)$. In fact, in our situation, we have a closed subset $S$ of $X=\mathrm{Gr}(n,k)$, say with inclusion map $j:S\to X$, and so the Euler characteristic of the pushforward $j_\ast\mathcal{O}_S$ is equal to $\chi(\mathcal{O}_S)$ itself. We can now compute the Euler characteristic $\chi(j_\ast\mathcal{O}_S)$ using the structure of the $K$-theory ring of the Grassmannian. Indeed, $S$ is the intersection of Schubert varieties indexed by the three partitions $\alpha$, $\beta$, and $\gamma$ (see Part I). So in the $K$-theory ring, if we identify structure sheaves of closed subvarieties with their pushforwards under inclusion maps, we have $$[\mathcal{O}_S]=[\mathcal{O}_\alpha]\cdot [\mathcal{O}_\beta]\cdot [\mathcal{O}_\gamma].$$ By the $K$-theoretic Littlewood-Richardson rule described above, this product expands as a sum of integer multiples of classes $[\mathcal{O}_\nu]$ where $|\nu|\ge |\alpha|+|\beta|+|\gamma|$. But in our setup we have $|\alpha|+|\beta|+|\gamma|=k(n-k)-1$, so $\nu$ is either the entire $k\times (n-k)$ rectangle (call this $\rho$) or it is the rectangle minus a single box (call this $\rho’$). In other words, we have:

$$[\mathcal{O}_S]=c^{\rho’}_{\alpha,\beta,\gamma}[\mathcal{O}_{\rho’}]-k[\mathcal{O}_{\rho}]$$ where $k$ is an integer determined by the $K$-theory coefficients. Notice that $c^{\rho’}_{\alpha,\beta,\gamma}$ is the usual Littlewood-Richardson coefficient, and counts exactly the size of the fibers (the set $\omega$ acts on) in our map from Part II. Let’s call this number $N$.

Finally, notice that $\Omega_\rho$ and $\Omega_{\rho’}$ are a point and a copy of $\mathbb{P}^1$ respectively, and so both have Euler characteristic $1$. It follows that $$\chi(\mathcal{O}_S)=N-k.$$
Going back to the genus, we see that if $S$ is connected, we have $g=1-\chi(\mathcal{O}_S)=k-(N-1)$.

Computing $k$ in terms of $\omega$

The fascinating thing about our algorithm for $\omega$ is that certain steps of the algorithm combinatorially correspond to certain tableaux that enumerate the $K$-theory coefficients, giving us information about the genus of $S$. These tableaux are called “genomic tableaux”, and were first introduced by Pechenik and Yong.

In our case, the genomic tableaux that enumerate $k$ can be defined as follows. The data of a tableau $T$ and two marked squares $\square_1$ and $\square_2$ in $T$ is a genomic tableau if:

  1. The marked squares are non-adjacent and contain the same entry $i$,
  2. There are no $i$’s between $\square_1$ and $\square_2$ in the reading word of $T$,
  3. If we delete either $\square_1$ or $\square_2$, every suffix of the resulting reading word is ballot (has at least as many $j$’s as $j+1$’s for all $j$).

For instance, consider the following tableaux with two marked (shaded) squares:

[Figure: four skew tableaux, each with two marked (shaded) squares, illustrating the three properties.]

Property 1 means that the fourth tableau is not genomic: the marked squares, while they do contain the same entry, are adjacent squares. The first tableau above violates Property 2, because there is a $2$ between the two marked $2$’s in reading order. Finally, the second tableau above violates Property 3, because if we delete the top marked $1$ then the suffix $221$ is not ballot. The third tableau above satisfies all three properties, and so it is genomic.

Finally, consider the algorithm for $\omega$ described in Part I. Jake and I discovered that the steps in Phase 1 in which the special box does not move to an adjacent square are in bijective correspondence with the genomic tableaux of total skew shape $\gamma^c/\alpha$ and content $\beta$ (where the marked squares only add $1$ to the total number of $i$’s). The correspondence is formed by simply filling the starting and ending positions of the special box with the entry $i$ that it moved past, and making these the marked squares of a genomic tableau. In other words:

The $K$-theory coefficient $k$ is equal to the number of non-adjacent moves in all Phase 1’s of the local algorithm for $\omega$.

Geometric consequences

This connection allows us to get a handle on the geometry of the Schubert curves $S$ using our new algorithm. As one illuminating example, let’s consider the case when $\omega$ is the identity permutation.

It turns out that the only way for $\omega$ to map a tableau back to itself is if Phase 1 consists of all vertical slides and Phase 2 is all horizontal slides; then the final shuffle step simply reverses these moves. This means that we have no non-adjacent moves, and so $k=0$ in this case. Since $\omega$, the monodromy operator on the real locus, is the identity, we also know that the number of real connected components is equal to $N$, which is an upper bound on the number of complex connected components (see here), which in turn is an upper bound on the Euler characteristic $\chi(\mathcal{O}_S)=\dim H^0(\mathcal{O}_S)-\dim H^1(\mathcal{O}_S)$. But the Euler characteristic is equal to $N$ in this case, and so there must be $N$ complex connected components, one for each of the real connected components. It follows that $\dim H^1(\mathcal{O}_S)=0$, so the arithmetic genus of each of these components is zero.

We also know each of these components is integral, and so they must each be isomorphic to $\mathbb{CP}^1$ (see Hartshorne, section 4.1 exercise 1.8). We have therefore determined the entire structure of the complex Schubert curve $S$ in the case that $\omega$ is the identity map, using the connection with $K$-theory described above.

Similar analyses lead to other geometric results: we have also shown that the Schubert curves $S$ can have arbitrarily high genus, and can have arbitrarily many complex connected components, for various values of $\alpha$, $\beta$, and $\gamma$.

So, what do Schubert curves, Young tableaux, and $K$-theory have in common? A little monodromy operator called $\omega$.

What do Schubert curves, Young tableaux, and K-theory have in common? (Part II)

In the last post, we discussed an operation $\newcommand{\box}{\square}
\omega$ on skew Littlewood-Richardson Young tableaux with one marked inner corner, defined as a commutator of rectification and shuffling. As a continuation, we’ll now discuss where this operator arises in geometry.

Schubert curves: Relaxing a Restriction

Recall from our post on Schubert calculus that we can use Schubert varieties to answer the question:

Given four lines $\ell_1,\ell_2,\ell_3,\ell_4\in \mathbb{C}\mathbb{P}^3$, how many lines intersect all four of these lines in a point?

In particular, given a complete flag, i.e. a chain of subspaces $$0=F_0\subset F_1\subset\cdots \subset F_n=\mathbb{C}^n$$ where each $F_i$ has dimension $i$, the Schubert variety of a partition $\lambda$ with respect to this flag is a subvariety of the Grassmannian $\mathrm{Gr}^k(\mathbb{C}^n)$ of $k$-dimensional subspaces, defined as
$$\Omega_{\lambda}(F_\bullet)=\{V\in \mathrm{Gr}^k(\mathbb{C}^n)\mid \dim V\cap F_{k+i-\lambda_i}\ge i\}.$$

Here $\lambda$ must fit in a $k\times (n-k)$ box in order for the Schubert variety to be nonempty. In the case of the question above, we can translate the question into an intersection problem in $\mathrm{Gr}^2(\mathbb{C}^4)$ with four general two-dimensional subspaces $P_1,P_2,P_3,P_4\subset \mathbb{C}^4$, and construct complete flags $F_\bullet^{(1)},F_\bullet^{(2)},F_\bullet^{(3)},F_\bullet^{(4)}$ such that their second subspace $F^{(i)}_2$ is $P_i$ for each $i=1,2,3,4$. Then the intersection condition is the problem of finding a plane that intersects all $P_i$’s in a line. The variety of all planes intersecting $P_i$ in a line is $\Omega_\box(F_\bullet^{(i)})$ for each $i$, and so the set of all solutions is the intersection $$\Omega_\box(F_\bullet^{(1)})\cap \Omega_\box(F_\bullet^{(2)})\cap \Omega_\box(F_\bullet^{(3)})\cap \Omega_\box(F_\bullet^{(4)}).$$ And, as discussed in our post on Schubert calculus, since the $k\times(n-k)$ box has size $4$ in $\mathrm{Gr}^2(\mathbb{C}^4)$ and the four partitions involved have sizes summing to $4$, this intersection has dimension $0$. The Littlewood-Richardson rule then tells us that the number of points in this zero-dimensional intersection is the Littlewood-Richardson coefficient $c_{\box,\box,\box,\box}^{(2,2)}$.

What happens if we relax one of the conditions in the problem, so that we are only intersecting three of the Schubert varieties above? In this case we get a one-parameter family of solutions, which we call a Schubert curve.

To define a Schubert curve in general, we require a sufficiently “generic” choice of $r$ flags $F_\bullet^{(1)},\ldots, F_\bullet^{(r)}$ and a list of $r$ partitions $\lambda^{(1)},\ldots,\lambda^{(r)}$ (fitting inside the $k\times (n-k)$ box) whose sizes sum to $k(n-k)-1$. It turns out that one can choose the flags $F_\bullet$ defined by the iterated derivatives at chosen points on the rational normal curve, the locus of points of the form $$(1:t:t^2:t^3:\cdots:t^{n-1})\in \mathbb{CP}^{n-1}$$ (along with the limiting point $(0:0:\cdots:1)$). In particular, consider the flag whose $k$th subspace $F_k$ is the span of the first $k$ rows of the matrix of iterated derivatives at the point on this curve parameterized by $t$:
$$\left(\begin{array}{cccccc}
1 & t & t^2 & t^3 & \cdots & t^{n-1} \\
0 & 1 & 2t & 3t^2 & \cdots & (n-1) t^{n-2} \\
0 & 0 & 2 & 6t & \cdots & \frac{(n-1)!}{(n-3)!} t^{n-3} \\
0 & 0 & 0 & 6 & \cdots & \frac{(n-1)!}{(n-4)!} t^{n-4} \\
\vdots & \vdots & \vdots & \vdots &\ddots & \vdots \\
0 & 0 & 0 & 0 & \cdots & (n-1)!
\end{array}\right)$$

This is called the osculating flag at the point $t$, and if we pick a number of points on the curve and choose their osculating flag, it turns out that they are in sufficiently general position in order for the Schubert intersections to have the expected dimension. So, we pick some number $r$ of these osculating flags, and choose exactly $r$ partitions $\lambda^{(1)},\ldots,\lambda^{(r)}$ with sizes summing to $k(n-k)-1$. Intersecting the resulting Schubert varieties defines a Schubert curve $S$.
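Here’s a small sympy sketch (mine) that builds this matrix of iterated derivatives for any $n$, just to make the definition concrete:

```python
# Rows are successive t-derivatives of the row vector (1, t, t^2, ..., t^(n-1)).
import sympy as sp

def osculating_matrix(n):
    t = sp.symbols('t')
    rows = [sp.Matrix([[t**j for j in range(n)]])]
    for _ in range(n - 1):
        rows.append(rows[-1].diff(t))
    return sp.Matrix.vstack(*rows), t

M, t = osculating_matrix(4)
sp.pprint(M)   # the 4 x 4 case of the matrix above
```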

A covering map

In order to get a handle on these curves, we consider the intersection of $S$ with another Schubert variety of a single box, $\Omega_\box(F_\bullet)$. In particular, after choosing our $r$ osculating flags, choose an $(r+1)$st point $t$ on the rational normal curve and choose the single-box partition $\lambda=(1)$. Intersecting the resulting Schubert variety $\Omega_\box(F_\bullet)$ with our Schubert curve $S$ gives us a zero-dimensional intersection, with the number of points given by the Littlewood-Richardson coefficient $c:=c^{B}_{\lambda^{(1)},\ldots,\lambda^{(r)},\box}$ where $B=((n-k)^k)$ is the $k\times (n-k)$ box partition.

By varying the choice of $t$, we obtain a partition of an open subset of the Schubert curve into sets of $c$ points. We can then define a map from this open subset of $S$ to the points of $\mathbb{CP}^1$ for which the preimage of $(1:t)$ consists of the $c$ points in the intersection given by choosing the $(r+1)$st point to be $(1:t:t^2:t^3:\cdots:t^{n-1})$.

In this paper written by my coauthor Jake Levinson, it is shown that if we choose all $r+1$ points to be real, then this can be extended to a map $S\to \mathbb{CP}^1$ for which the real locus $S(\mathbb{R})$ is a smooth, finite covering of $\mathbb{RP}^1$. The fibers of this map have size $c=c^{B}_{\lambda^{(1)},\ldots,\lambda^{(r)},\box}$. Note that this Littlewood-Richardson coefficient counts the number of ways of filling the box $B$ with skew Littlewood-Richardson tableaux with contents $\lambda^{(1)},\ldots,\lambda^{(r)},\box$ such that each skew shape extends the previous shape outwards. In addition, by the symmetry of Littlewood-Richardson coefficients, it doesn’t matter what order in which we use the partitions. It turns out that there is a canonical way of labeling the fibers by these chains of tableaux, for which the monodromy of the covering is described by shuffling operators.

Let’s be more precise, and consider the simplest interesting case, $r=3$. We have three partitions $\alpha$, $\beta$, and $\gamma$ such that $$|\alpha|+|\beta|+|\gamma|=k(n-k)-1=|B|-1.$$ Let’s choose, for simplicity, the three points $0$, $1$, and $\infty$ to define the respective osculating flags. Then we can label the points in the fiber $f^{-1}(0)$ of the map $f:S\to \mathbb{RP}^1$ by the fillings of the box with a chain of Littlewood-Richardson tableaux of contents $\alpha$, $\box$, $\beta$, and $\gamma$ in that order. Note that there is only one Littlewood-Richardson tableau of straight shape and content $\alpha$, and similarly for the anti-straight shape $\gamma$, so the data here consists of a skew Littlewood-Richardson tableau of content $\beta$ with an inner corner chosen to be the marked box. This is the same object that we were considering in Part I.

Now, we can also label the fiber $f^{-1}(1)$ by the chains of contents $\box$, $\alpha$, $\beta$, and $\gamma$ in that order, and finally label the fiber $f^{-1}(\infty)$ by the chains of contents $\alpha$, $\beta$, $\box$, and $\gamma$ in that order, in such a way that if we follow the curve $S$ along an arc from a point in $f^{-1}(0)$ to $f^{-1}(1)$, or similarly from $1$ to $\infty$ or from $\infty$ to $0$, then the map between the fibers is given by the shuffling operations described in the last post! In particular, if we follow the arc from $0$ to $\infty$ that passes through $1$, the corresponding operation on the fibers is given by the “evacuation-shuffle”, the first three steps of the operator $\omega$ described in the previous post. The arc from $\infty$ back to $0$ on the other side is given by the “shuffle” of the $\box$ past $\beta$, which is the last step of $\omega$. All in all, $\omega$ gives us the monodromy operator on the zero fiber $f^{-1}(0)$.

The following picture sums this up:

[Figure: the covering map $S(\mathbb{R})\to\mathbb{RP}^1$, with the fibers over $0$, $1$, and $\infty$ labeled by chains of tableaux and the arcs between them labeled by the shuffling operations.]

So, the real geometry of the Schubert curve boils down to an understanding of the permutation $\omega$. Our local algorithm allows us to get a better handle on the orbits of $\omega$, and hence tell us things about the number of real connected components, the lengths of these orbits, and even in some cases the geometry of the complex curve $S$.

Next time, I’ll discuss some of these consequences, as well as some fascinating connections to the $K$-theory of the Grassmannian. Stay tuned!