The Garsia-Procesi Modules: Part 2

This is a long overdue followup post to Garsia-Procesi Modules: Part 1 that I finally got around to editing and posting. Enjoy!

In this post, I talked about the combinatorial structure of the Garsia-Procesi modules $R_\mu$, the cohomology rings of the type A Springer fibers. Time to dive even further into the combinatorics!

Tanisaki generators, visually

Recall that, for a partition $\mu$ of $n$, the graded $S_n$-module $R_\mu$ can be constructed (due to Tanisaki) as the quotient ring $$\mathbb{C}[x_1,\ldots,x_n]/I_\mu$$ where $I_\mu$ is generated by certain partial elementary symmetric functions called Tanisaki generators.

Define $d_k(\mu)=\mu’_{n-k+1}+\mu’_{n-k+2}+\cdots$ to be the sum of the last $k$ columns of $\mu$, where we pad the conjugate partition $\mu’$ with $0$’s in order to think of it as as a partition of $n$ having $n$ parts. Then the partial elementary symmetric function $e_r(x_{i_1},\ldots,x_{i_k})$ is a Tanisaki generator if and only if $k-d_k(\mu)\lt r\le k$. Tanisaki and Garsia and Procesi both use this notation, but I find $d_k$ hard to remember and compute with, especially since it involves adding zeroes to $\mu$ and adding parts of its transpose in reverse order, and then keeping track of an inequality involving it to compute the generators.

An equivalent, and perhaps simpler, definition is as follows. If $n-k\lt \mu_1$, then the quantity $k-d_k(\mu)$ is equal to the number of squares in the first $n-k$ columns of $\mu$, excluding the first row. Indeed, we have $$k-d_k(\mu)=n-d_k(\mu)-(n-k),$$ and on the right hand side, $n-d_k(\mu)$ is simply the number of squares in the first $n-k$ columns of $\mu$. Subtracting $n-k$ is then equivalent to removing one square from each column in that count, which can be done by crossing out the first row of $\mu$.

So, define $$s_t(\mu)=\mu’_1+\cdots+\mu’_t-t$$ to be the number of squares in the first $t$ columns, not including the first row. Then we include an elementary symmetric function $e_r$ in $k$ of the variables $x_i$ if $$r\gt s_{n-k}(\mu).$$ (Note that the upper bound condition, $r\le k$, is necessary for the elementary symmetric function to be nonzero, so we technically don’t need to state the upper bound).

For instance, if $n=8$ and $\mu=(4,3,1)$, then the partition diagram looks like:

where the bottom row is x’ed out to remind ourselves not to count it.

For $k=8$, we require $r\gt s_0(\mu)=0$, so all elementary symmetric functions $e_1,\ldots,e_8$ in the $8$ variables $x_1,\ldots,x_8$ are in $I_\mu$.

For $k=7$ we require $r\gt s_1(\mu)=2$, so $e_3,e_4,\ldots,e_7$ on any seven of the variables are generators.

For $k=6$ we require $r \gt s_2(\mu)=3$, so $e_4,e_5,e_6$ on any six of the variables are generators.

For $k=5$ we require $r \gt s_3(\mu)=4$, so only $e_5$ on any five of the variables is a generator.

For $k=4$ we require $r\gt s_4(\mu)=4$, and we have no additional generators.

For $k$ such that $n-k>\mu_1$, we also never get any additional generators, since $k-d_k(\mu)=k$ in this case. Therefore the combinatorial interpretation above, which is only valid for $n-k<\mu_1$, is in fact an equivalent definition for the Tanisaki generators.

It’s not fundamentally all that different, but working with columns of a partition from the left rather than the right can save a bit of mental space during computations.

A smaller set of generators

It turns out that, for $|S|<n$, it suffices to only include $e_r(S)$ for the smallest possible $r$ in order to generate the Tanisaki ideal. We prove this here. Write $X={x_1,\ldots,x_n}$ throughout.

Proposition. The ideal $I_\mu$ is generated by the following elements:

  • The elementary symmetric functions $e_1,\ldots,e_n$ in all $n$ variables $x_1,\ldots,x_n$, and
  • The partial elementaries $e_r(S)$ where $S\subseteq X$ with $|S|=k<n$ and $$r=s_{n-k}(\mu)+1.$$

Proof. Let $I_\mu^0$ be the ideal generated by the above functions. To show $I_\mu^0=I_\mu$, it suffices to show that for every $k s_{n-k}(\mu)$. We show this by nested induction, first on $n-k$ and then on $r$.

For the base case, $n-k=0$, there is nothing to show since $k=n$. Now fix $k$ and assume the claim holds for all smaller values of $n-k$ (for all larger values of $k$). Let $S\subseteq X$ with $|S|=k$, and since $k<n$, choose a missing variable $x_i\in X-S$.

Define $r_0=s_{n-k}(\mu)+1$; we have $e_{r_0}(S)\in I_\mu^0$ by the definition of $I_\mu^0$. Now let $r>r_0$ and assume $e_t(S)\in I_\mu^0$ for $r_0\le ts_{n-(k+1)}(\mu)$ and so $e_r(S\cup {x_i})\in I_\mu^0$ by the induction hypothesis on $n-k$. Therefore $e_r(S)\in I_\mu^0$ as desired. $\square$

Recursive algorithm for expanding in basis $\mathcal{B}(\mu)$

Garsia and Procesi define a basis $\mathcal{B}(\mu)$ of monomials for $R_\mu$ recursively by $$\mathcal{B}(\mu)=\bigcup_i x_n^{i-1}\mathcal{B}(\mu^{(i)}).$$ (Recall that $\mu^{(i)}$ is the partition of $n-1$ formed by removing one square from the $i$th part and re-sorting the resulting rows in partition order.) They then give a completely elementary inductive proof that every other element of $R_\mu$ can be expressed as a linear combination of basis elements from $\mathcal{B}(\mu)$. We describe their algorithm here, with a few slight modifications.

First, note that it suffices to describe how to express any monomial $x^\alpha$ as a linear combination of the elements of $\mathcal{B}_\mu$, plus an element of $I_\mu$. For $n=0$, the only partition is the empty partition, $R_\emptyset=\mathbb{C}$, and the unique basis element is $1$, so $x^\alpha=x^0=1$ is its entire expansion (since $I_\emptyset=(0)$).

For $n>0$, we use the following recursive algorithm.

  1. Given $x^\alpha=x_1^{\alpha_1}\cdots x_n^{\alpha_n}$, let $i-1=\alpha_n$ be the exponent of $x_n$.
  2. Setting $\beta=(\alpha_1,\ldots,\alpha_{n-1})$, we have that $x^{\beta}=x_1^{\alpha_1}\cdots x_{n-1}^{\alpha_{n-1}}$ can be interpreted as a representative of an element in $R_{\mu^{(i)}}$. Use this algorithm recursively to express $x^\beta$ in terms of $\mathcal{B}(\mu^{(i)})$ plus an error term in $I_{\mu^{(i)}}$, giving an expansion $$x^\beta=\sum_{b\in \mathcal{B}(\mu^{(i)})} c_b b(x_1,\ldots,x_{n-1}) +E$$ where $E\in I_{\mu^{(i)}}$.
  3. Multiplying both sides by $x_n^{i-1}$, we have $$x^\alpha=\sum_{b\in \mathcal{B}(\mu^{(i)})}c_bx_n^{i-1}b(x_1,\ldots,x_{n-1})+x_n^{i-1}E.$$ Each monomial $x_n^{i-1}b(x_1,\ldots,x_{n-1})$ is an element of $\mathcal{B}(\mu)$ by the recursive definition of the basis $\mathcal{B}(\mu)$.
  4. We next expand $x_n^{i-1}E$ as a linear combination of elements of $\mathcal{B}(\mu)$ plus an element of $I_\mu$. Since $E\in I_{\mu^{(i)}}$, we can write $$E=\mathop{\sum_{r>s_{n-k}(\mu^{(i)})}}_{i_1,\ldots,i_k\in [n-1]} a_{r,\{i_j\}}e_r(x_{i_1},\ldots,x_{i_k}).$$ We will consider each term individually, expressing $x_n^{i-1}e_r(x_{i_1},\ldots,x_{i_k})$ in terms of $I_{\mu}$ plus elements of $x_n^{i}R_\mu$ (hence reducing to a case in which the exponent of $x_n$ is larger, at which point we can repeat the algorithm until the exponent is above $\mu’_1$.)
  5. If either $n-k\lt \mu_i$, or if $n-k\ge \mu_i$ and $r>s_{n-1-k}(\mu^{(i)})+1=s_{n-1-k}(\mu)$, then $e_r(x_{i_1},\ldots,x_{i_k},x_n)\in I_\mu$ and so the expansion $$x_n^{i-1}e_r(x_{i_1},\ldots,x_{i_k})=x_n^{i-1}e_r(
    x_{i_1},\ldots,x_{i_k} ,x_n)-x_n^ie_{r-1}(x_{i_1},\ldots,x_{i_k})$$ is our desired expression in $I_\mu+x^iR_\mu$.
  6. Otherwise, if $n-k\ge \mu_i$ and $r=s_{n-1-k}(\mu^{(i)})+1=s_{n-1-k}(\mu)$, then note that we must have $i>1$. Iteratively using the identity $x_ne_r(x_{i_1},\ldots,x_{i_k})=e_{r+1}(x_{i_1},\ldots,x_{i_k},x_n)-e_{r+1}(x_{i_1},\ldots,x_{i_k})$, we can multiply by $x_n$ exactly $i-1$ times to obtain $$\begin{align*} x_n^{i-1}e_r(x_{i_1},\ldots,x_{i_k})= & \phantom{+}x_n^{i-2}e_{r+1}(x_{i_1},\ldots,x_{i_k},x_n) \\ &-x_n^{i-3}e_{r+2}(x_{i_1},\ldots,x_{i_k},x_n) \\ &+\cdots \\ &+ (-1)^{i-2}e_{r+i-1}(x_{i_1},\ldots,x_{i_k},x_n) \\ &+(-1)^{i-1}e_{r+i-1}(x_{i_1},\ldots,x_{i_k})\end{align*}$$ all of whose terms on the right hand side are in $I_\mu$.
  7. We now have expressed each term of $E$ in the form $I+x_n^iE_1$ where $I\in I_\mu$ is expressed in terms of the Tanisaki generators, and $E_1\in R_\mu$. We iterate steps 1-6 on each term of $x_n^iE_1$ and continue until we only have monomials having $x_n^h$ as a factor where $h=\mu_1’$ is the height of $\mu$.
  8. Note that $$\begin{align*}x_n^h= &\phantom{+}x_n^{h-1}e_1(x_1,\ldots,x_n) \\ &-x_n^{h-2}e_2(x_1,\ldots,x_n) \\ &+\cdots \\ &+(-1)^{h-1}e_h(x_1,\ldots,x_n) \\ &-e_h(x_1,\ldots,x_{n-1}). \end{align*}$$ The first $h$ terms above are clearly in $I_\mu$, and the last term $e_h(x_1,\ldots,x_{n-1})$ is in $I_\mu$ as well because $h>s_{n-(n-1)}(\mu)=\mu’_1-1$. The above expansion, as well as the similar relations for $x_i^h$ obtained by acting by an appropriate element of $S_n$, ensures that the process terminates.

We can now use this algorithm to express every monomial of degree at most $n(\mu)=\sum_i (i-1)\mu_i$ (which is the highest nonzero degree of $R_\mu$) in terms of the basis $\mathcal{B}(\mu)$ plus an element of $I_\mu$, expressed explicitly in terms of Tanisaki generators.

In order to minimize recursive calls, one should build this database of expansions starting with partitions of size $1$, then of size $2$, and so on. Further, for a given partition shape $\mu$, one should first expand the monomials of degree at most $n(\mu)$ with the highest possible exponent of $x_n$ (namely $x_n^h$ where $h=\mu_1’$ and then continue with the monomials having $n$th exponent $x_n^{h-1}$, then with $x_n^{h-2}$, and so on. This ensures that steps 4-8 of the algorithm only need to be run once on each summand in the error term $E$ obtained in step 3.

Expansions up to $|\mu|=3$

Using the above algorithm, I was able to quickly calculate all monomial expansions of degree at most $n(\mu)$ for all partitions $\mu$ of size at most $3$ by hand. In some cases, I worked out some of the expansions of the monomials of degree larger than $n(\mu)$, which lie entirely in $I_\mu$, as a convenience for the calculations in the subsequent cases. (In general, if one were to code this algorithm with the goal of finding the expansions for $R_\mu$, one should calculate the $I_{\lambda}$ expansions for all monomials of degree up to $n(\mu)$ for ALL partitions $\lambda\le \mu$, in order to maximize efficiency at each step.)

For $\mu=(1)$, we have $\mathcal{B}((1))=\{1\}$ and $I_\mu=(e_1(x_1))$. The expansions are:

  • $x_1=e_1(x_1)$
  • $1=1$

For $\mu=(2)$, we have $\mathcal{B}((1))=\{1\}$ and $I_\mu=(e_1(x_1,x_2),e_2(x_1,x_2),e_1(x_1),e_1(x_2))$. The expansions are:

  • $x_2=e_1(x_2)$
  • $x_1=e_1(x_1)$
  • $1=1$

For $\mu=(1,1)$, we have $\mathcal{B}((1))=\{1,x_2\}$ and $I_\mu=(e_1(x_1,x_2),e_2(x_1,x_2))$. The expansions are:

  • $x_2^2=x_2e_1(x_1,x_2)-e_2(x_1,x_2)$
  • $x_2x_1=e_2(x_1,x_2)$
  • $x_2=x_2$
  • $x_1^2=x_1e_1(x_1,x_2)-e_2(x_1,x_2)$
  • $x_1=-x_2+e_1(x_1,x_2)$
  • $1=1$

For $\mu=(3)$, we have $\mathcal{B}((1))=\{1\}$ and $I_\mu$ is the set of all partial elementary symmetric functions in three variables, so $R_\mu=\mathbb{C}$. The expansions are:

  • $1=1$

For $\mu=(2,1)$, we have $\mathcal{B}((1))=\{1,x_2,x_3\}$ and $I_\mu=(e_1,e_2,e_3,e_2(x_1,x_2),e_2(x_1,x_3),e_2(x_2,x_3))$, where $e_i$ without any variables indicates the full elementary symmetric functions using all three variables. The expansions are:

  • $x_3^2=x_3e_1-e_2+e_2(x_1,x_2)$
  • $x_3=x_3$
  • $x_2=x_2$
  • $x_1=-x_3-x_2+e_1$
  • $1=1$

For $\mu=(1,1,1)$, we have $\mathcal{B}((1))=\{1,x_2,x_3,x_3x_2,x_3^2,x_3^2x_2\}$ and $I_\mu=(e_1(x_1,x_2,x_3),e_2(x_1,x_2,x_3),e_3(x_1,x_2,x_3))$. For shorthand in this case we simply write $e_1,e_2,e_3$ since there are no strictly partial $e_r$’s in $I_\mu$. The expansions are:

  • $x_3^3=x_3^2e_1-x_3e_2+e_3$
  • $x_3^2x_2=x_3^2x_2$
  • $x_3^2x_1=-x_3^2x_2+x_3e_2-e_3$
  • $x_3^2=x_3^2$
  • $x_3x_2^2=-x_3^2x_2+(x_2x_3+x_3^2)e_1-x_3e_2$
  • $x_3x_1^2=x_3^2x_2+(x_3^2+x_3x_1)e_1
    -2x_3e_2 +e_3$
  • $x_3x_2x_1=e_3$
  • $x_3x_2=x_3x_2$
  • $x_3x_1=-x_3x_2-x_3^2+x_3e_1$
  • $x_3=x_3$
  • $x_2^3=x_2^2e_1-x_2e_2+e_3$
  • $x_2^2x_1=x_3^2x_2-x_2x_3e_1+x_2e_2$
  • $x_2^2=-x_3x_2+(x_2+x_3)e_1-e_2$
  • $x_2x_1^2=-x_3^2x_2-x_1x_3e_1+(x_1+x_3)e_2-e_3$
  • $x_2x_1=x_3^2-x_3e_1+e_2$
  • $x_2=x_2$
  • $x_1^3=x_1^2e_1-x_1e_2+e_3$
  • $x_1^2=x_3x_2+x_3^2-x_1e_1-e_2$
  • $x_1=-x_2-x_3+e_1$
  • $1=1$

Addicted to Crystal Math

Some of my recent research has centered on crystals, so here’s an introductory post, from a combinatorial point of view, on what crystals are and where they come up.

First, a clarification: the crystals I have in mind are not the chemical kind that you find in nature, nor are they any of these other uses of the word crystal on the Wikipedia disambiguation page. There is yet another use of the word, sometimes referred to as a crystal base, introduced by Kashiwara in the mid-90’s. Kashiwara developed crystal base theory as a way of understanding the $q\to 0$ limit of the representation theory of quantum groups $U_q(\mathfrak{g})$ where $\mathfrak{g}$ is a Lie algebra and $U_q(\mathfrak{g})$ is a $q$-analog of the universal enveloping algebra $U(\mathfrak{g})$. In the quantized statistical mechanics models in which quantum groups come up, the parameter $q$ is a measure of temperature, so the $q\to 0$ limit is exploring what happens in the situation of absolute zero temperature. Hence the word “crystal”, referring to what happens to many forms of matter when frozen.

Although the original motivation for crystals came from quantum groups, the combinatorial structure is also very useful in symmetric function theory. One example of a crystal is a certain weighted, edge-labeled directed graph structure on the summands that comprise a Schur polynomial. Recall that the Schur function $s_{\lambda/\mu}(x_1,\ldots,x_n)$ is given by the sum $$s_{\lambda/\mu}=\sum_{T\in \mathrm{SSYT}(\lambda/\mu)} x^T$$ where $\mathrm{SSYT}(T)$ is the set of semistandard Young tableaux of skew shape $\lambda/\mu$, and $x^T=x_1^{m_1}x_2^{m_2}\cdots$ where $m_i$ is the number of $i$’s in $T$. For instance, we have $$s_{(2,1)}(x_1,x_2,x_3)=x_1^2x_2+x_1x_2^2+x_1^2x_3+2x_1x_2x_3+x_1x_3^2+x_2^2x_3+x_2x_3^2,$$ since the eight semistandard Young tableaux of straight shape $(2,1)$ are:

We can organize these tableaux by weight in a 2-dimensional grid, since each weight is a tuple $(m_1,m_2,m_3)$ where $m_1+m_2+m_3=3$. We angle the grid so that either changing the weight by $(-1,1,0)$ or by $(0,-1,1)$ is “lowering” the weight, going downwards at a 45 degree angle to the right or left. We will then connect some of the tableaux along the edges of the grid with edges labeled by weight-lowering operators $F_1$ and $F_2$ (with inverses $E_1$ and $E_2$), shown in red and blue below:

In general $F_i$ changes the weight by $(0,0,\ldots,-1,1,0,\ldots,0)$ with the $-1$ in the $i$-th position, and we think of $F_i$ as an operator on tableaux that changes an $i$ into an $i+1$. To define $F_i(T)$, we consider the reading word $w$ of the tableau $T$, formed by concatenating the rows from bottom to top, and consider just the $i$ and $i+1$ entries of $w$. Replace each $i$ with a closed bracket ‘)’, and each $i+1$ with an open bracket ‘(‘, and cancel the brackets in matching pairs until only a word of the form $$)))\cdots ))(((\cdots (($$ (or alternatively, of the form $ii\cdots ii(i+1)(i+1)\cdots(i+1)$) remains. Then $F_i$ changes the last $i$ in this subsequence to $i+1$. For example, if $T$ looks like:

then to apply $F_1$ to $T$, we consider the reading word of $1$s and $2$s and cancel matching pairs as follows:

2 & 2 & 1 & 1 & 1 & 1 & 2 & 1 & 2 & 2 & 1 \\
( & ( & ) & ) & ) & ) & ( & ) & ( & ( & ) \\
( & & & ) & ) & ) & & & ( & & \\
& & & & ) & ) & & & ( & &

Then $F_1$ changes the rightmost $)$ that was not cancelled to $($, changing that $1$ to $2$. The word becomes:

2 & 2 & 1 & 1 & 1 & 2 & 2 & 1 & 2 & 2 & 1.

Note that $F_i$ may not be defined, if there is no closed bracket that is not cancelled. If no $F_i$ operator may be applied to a tableau $T$, we say that $T$ is lowest weight. The bottom tableau on the crystal diagram above is lowest weight for the operators $F_1,F_2$. Similarly, we can define raising operators $E_i$ as the natural partial inverses of the $F_i$ operators (simply by reversing the arrows in the crystal graph), and we define highest weight elements as those for which all $E_i$ operators are undefined.

Now here’s the interesting thing: highest weight tableaux or words are precisely the Littlewood-Richardson tableaux that we have encountered in several previous posts. Recall that a tableau is Littlewood-Richardson if its reading word $w=w_1\cdots w_n$ is ballot, meaning that every tail of the word (of the form $w_k\cdots w_n$ for some $k$) has at least as many $1$’s as $2$’s, at least as many $2$’s as $3$’s, and so on. In other words, when reading the word from right to left, the smaller letters always occur more frequently than larger letters. For instance, the word $$12122111$$ is ballot, and it is not hard to see that ballotness in a word with just $1$’s and $2$’s is equivalent to the condition that every $2$ is bracketed with a $1$, so that $E_1$ is undefined.

The key fact about tableau crystals is as follows:

Every connected component of a crystal graph has a unique highest weight element.

For instance, here is the crystal graph for the skew shape $(3,1)/(1)$:

Note that each connected component has a Littlewood-Richardson tableau at the top, and no other elements are killed by all raising operators. Notice also that the first graph above has the same structure as that of the shape $(2,1)$ crystal drawn above. This is no coincidence: another key fact about the operators $F_i$ and $E_i$ is that they commute with Jeu de Taquin rectification, which is easily proven using Knuth equivalence. The tableaux in the shape $(2,1)$ crystal are simply the Jeu de Taquin rectifications of those in the corresponding component of the $(3,1)/(1)$ graph. Similarly, the tableaux in the second component rectify to form the tableau crystal for shape $(3)$. It follows that, if we sum the monomial weights in the graph, we obtain the equation $$s_{(3,1)/(1)}=s_{(2,1)}+s_{(3)},$$ which is an instance of the famous Littlewood-Richardson rule.

In general, let $c^{\lambda}_{\mu\nu}$ be the number of Littlewood-Richardson tableaux of shape $\lambda/\mu$ and content $\nu$. Then recall that the Littlewood-Richardson rule for multiplying Schur functions states:

$$s_\mu s_\nu = \sum_{\lambda} c^{\lambda}_{\mu\nu} s_\lambda,$$

or equivalently, after a bit of symmetric function theory manipulation,

$$s_{\lambda/\mu}=\sum_{\nu} c^{\lambda}_{\mu\nu} s_\nu.$$

The latter equation now follows immediately from the facts about crystals listed above, as we saw for the shape $(3,1)/(1)$. We can also prove the product version directly using crystals, by introducing a tensor product. There is a natural tensor product on crystals coming from their correspondence with Lie algebra representations, which we will define in general on the next page. In combinatorial terms, we can define the tensor product of crystals of tableaux by defining the concatenation of two tableaux $T$ and $S$ to be the tableau $T\cdot S$ formed by drawing $S$ above and to the right of $T$, with the lower left corner of $S$ matching the upper right corner of $T$. For instance, tableaux of skew shape $(3,1)/(1)$ can be formed by concatenating a tableau of shape $(1)$ with a tableau of shape $(2)$. (Note that this concatenates the reading words of $T$ and $S$ as well, and induces the plactic monoid structure on Knuth equivalence classes of words.)

Then the tensor product $\mathcal{B}_\mu \otimes \mathcal{B}_\nu$ of two tableaux crystals on shapes $\mu$ and $\nu$ is the crystal on all words of the form $T\cdot S$ for $T\in \mathcal{B}_\mu$ and $S\in \mathcal{B}_\lambda$, with operators $F_i$ connecting the nodes appropriately. It turns out that every connected component in the resulting crystal are again full crystals of tableaux, with Schur functions as their weight generating functions. We thus obtain a crystal-theoretic interpretation of the multiplicative Littlewood-Richardson rule. Namely, we have
$$\mathcal{B}_\mu \otimes \mathcal{B}_\nu=\bigcup c^\lambda_{\mu\nu} \mathcal{B}_{\lambda}.$$

For instance, $\mathcal{B}_{(1)}\otimes \mathcal{B}_{(2)}$ is the crystal whose graph is precisely the diagram of $\mathcal{B}_{(3,1)/(1)}$ shown above, so $$\mathcal{B}_{(1)}\otimes \mathcal{B}_{(2)}=\mathcal{B}_{(2,1)}\cup \mathcal{B}_{(3)},$$ matching the Schur expansion $s_{(1)}s_{(2)}=s_{(2,1)}+s_{(3)}.$

With these examples in mind, on the next page we will define the tensor category of crystals over any semisimple Lie algebra.

The CW complex structure of the Grassmannian

In a previous post, we briefly described the complex Grassmannian $\mathrm{Gr}(n,k)$ as a CW complex whose cells are the Schubert cells with respect to a chosen flag. We’ll now take a closer look at the details of this construction, along the lines of the exposition in this master’s thesis of Tuomas Tajakka (chapter 3) or Hatcher’s book Vector Bundles and $K$-theory (page 31), but with the aid of concrete examples.

Background on CW complexes

Let’s first review the notion of a CW complex (or cell complex), as described in Hatcher’s Algebraic Topology.

An $n$-cell is any topological space homeomorphic to the open ball $B_n=\{v\in\mathbb{R}^n:|v|<1\}$ in $\mathbb{R}^n$. Similarly an $n$-disk is a copy of the closure $\overline{B_n}=\{v\in \mathbb{R}^n:|v|\le 1\}$.

To construct a cell complex, one starts with a set of points called the $0$-skeleton $X^0$, and then attaches $1$-disks $D$ via continuous boundary maps from the boundary $\partial D$ (which simply consists of two points) to $X^0$. The result is a $1$-skeleton $X^1$, which essentially looks like a graph:

This an then be extended to a $2$-skeleton by attaching $2$-disks $D$ via maps from the boundary $\partial D$ (which is a circle) to $X^1$. Note that, as in the picture below, the attaching map may collapse the entire boundary to a point, making the circle into a balloon-like shape.

In general the $n$-skeleton $X^n$ is formed from $X^{n-1}$ by attaching a set of $n$-disks by maps from their boundaries to $X^{n-1}$.

More precisely, to form $X^n$ from $X^{n-1}$, we start with a collection of $n$-disks $D^n_\alpha$ and continuous attaching maps $\varphi_\alpha:\partial D_\alpha^n\to X^{n-1}$. Then $$X^n=\frac{X^{n-1}\sqcup \bigsqcup_\alpha D_\alpha^n}{\sim}$$ where $\sim$ is the identification $x\sim \varphi_\alpha(x)$ for $x\in \partial D^n_\alpha$. The cell complex is $X=\bigcup_n X^n$, which may be simply $X=X^n$ if the process stops at stage $n$.

By the construction, the points $X^0$ along with the open $i$-cells associated to the $i$-disks in $X^i$ for each $i$ are disjoint and cover the cell complex $X$. The topology on $X$ is given by the rule that $A\subset X$ is open if and only if $A\cap X^n$ is open in $X^n$ for all $n$.

The real projective plane

As an example, consider the real projective plane $\mathbb{P}^2_{\mathbb{R}}$. It has a cell complex structure in which $X^0=\{(0:0:1)\}$ is a single point, and $X^1=X^0\sqcup \{(0:1:\ast)\}$ is topologically a circle formed by attaching a $1$-cell to the point at both ends.

Then, $X^2$ is formed by attaching a $2$-cell $\mathbb{R}^2$ to the circle such that the boundary wraps around the $1$-cell twice. Intuitively, this is because the points of the form $(1:xt:yt)$ as $t\to \infty$ and as $t\to -\infty$ both approach the same point in $X^1$, so the boundary map must be a $2$-to-$1$ mapping.

To make this analysis more rigorous, consider $\mathbb{P}^2_{\mathbb{R}}$ as the quotient $(\mathbb{R}^3\setminus\{(0,0,0)\})/\sim$ where $\sim$ is the equivalence relation given by scalar multiplication. In particular there is a natural continuous quotient map $$q:\mathbb{R}^3\setminus \{0\}\to \mathbb{P}^2_{\mathbb{R}}.$$ We can now take $n$-disks in $\mathbb{R}^3\setminus\{0\}$ and map them into $\mathbb{P}^2_{\mathbb{R}}$ via $q$, which will give the CW complex structure described above.

In particular, the point $(0,0,1)$ maps to $(0:0:1)$, giving the $0$-skeleton. Then consider the set of points $$\{(0,a,b):a^2+b^2=1, a\ge 0\}\subseteq \mathbb{R}^3\setminus \{0\},$$ which is a half-circle and thus homeomorphic to a $1$-disk. The quotient map $q$ gives an attaching map of the boundary of this $1$-disk to $(0:0:1)$, by mapping both endpoints $(0,0,1)$ and $(0,0,-1)$ to $(0:0:1)$. It also maps the interior of this disk bijectively to the $1$-cell $(0:1:\ast)$ in $\mathbb{P}^2_{\mathbb{R}}$.

Finally, consider the set $$D=\{(a,b,c):a^2+b^2+c^2=1,a\ge 0\},$$ which is a half-sphere and thus homeomorphic to a $2$-disk. Then under $q$, the interior of $D$ maps to the $2$-cell $(1:\ast,\ast)$, and $q$ gives a $2$-to-$1$ attaching map of the boundary $\partial D=\{(0,b,c): b^2+c^2=1\}$ onto the $1$-skeleton $\{(0:x:y)\}$. Indeed, given a desired ratio $(0:x:y)$, there are two points $(0,b,c)\in \partial D$ mapping to it, since a line with a given slope intersects the unit circle in exactly two points.

The complex projective plane

A nearly identical construction to the one above can help us realize the complex projective plane $\mathbb{P}_{\mathbb{C}}^2$ as a CW complex as well, though the structure is somewhat different.

In particular, we can again start with the point $(0:0:1)$ as our $0$-skeleton, and attach the $1$-cell $(0:1:\ast)$ as follows. Let $$q:\mathbb{C}^3\setminus\{0\}\to \mathbb{P}_{\mathbb{C}}^2$$ be the natural quotient map, and define $$D=\{(0,b,c):|(0,b,c)|=1, b\in \mathbb{R}_{\ge 0}\}\subseteq \mathbb{C}^3\setminus \{0\}.$$ Then we can write $c=x+yi$ and the conditions defining $D$ are equivalent to $b,x,y\in \mathbb{R}$, $b\ge 0$, $b^2+x^2+y^2=1$ (since $|c|^2=c\overline{c}=x^2+y^2$). Thus $D$ is a real half-sphere and hence a $2$-disk. The quotient map takes the interior of $D$, on which $b>0$, bijectively and continuously to $(0:1:\ast)$, while collapsing its entire circular boundary to the $0$-skeleton as the attaching map.

Finally, we can similarly take the image of $$E=\{(a,b,c): |(a,b,c)|=1, a\in \mathbb{R}_{\ge 0}\},$$ which is homeomorphic to a 4-disk, to obtain the remaining points in $\mathbb{P}_{\mathbb{C}}^2$. (Indeed, setting $b=x+yi$ and $c=z+wi$ for real numbers $x,y,z,w$, the space $E$ is cut out by the equations $a^2+x^2+y^2+z^2+w^2=1$, $a\ge 0$ in $5$-dimensional real space, so it is a $4$-hemisphere.) The attaching map on the boundary is “circle-to-one”: every point in the $1$-skeleton has a circle as its preimage in $E$. (Can you prove this?)

Note that these constructions can be continued for higher dimensional projective spaces over $\mathbb{C}$ or $\mathbb{R}$.

The Grassmannian

We now can analyze the complex Grassmannian $\mathrm{Gr}(n,k)$ – the space of $k$-dimensional subspaces of $\mathbb{C}^n$ – in a similar manner to that of a complex projective space. In fact, $\mathbb{P}_{\mathbb{C}}^n=\mathrm{Gr}(n+1,1)$, so we have already done this for some Grassmannians.

For $\mathrm{Gr}(n,k)$ with $k\ge 2$, we use a similar quotient map, but this time from the Stiefel manifold $V_k=V_k(\mathbb{C}^n)$, the space of orthonormal $k$-frames. That is, $V_k$ is the space of all tuples of linearly independent vectors $(v_1,\ldots,v_k)\in (\mathbb{C}^n)^k$ for which $$\langle v_i, v_j\rangle=\delta_{ij}$$ for all $i$ and $j$. (Here $\langle-,-\rangle$ can be taken as the standard Hermitian inner form $\langle (z_1,\ldots,z_n), (w_1,\ldots,w_n) \rangle=\sum z_i \overline{w_i}$.)

Now, there is a natural quotient map $q:V_k\to \mathrm{Gr}(n,k)$ given by row-reducing the matrix formed by the $k$-frame, as we did in this post. Note that $V_1=\mathbb{C}^n\setminus\{0\}$ is the space we used above for projective space, and that $q$ matches our map above in this case.

We will not go into full details of the CW complex construction here, since Tajakka’s paper does it so nicely already. Instead, let’s outline the construction for one of the nontrivial attaching maps in $\mathrm{Gr}(4,2)$.

Recall that in $\mathrm{Gr}(4,2)$, the Schubert varieties are indexed by partitions whose Young diagram fits inside a $2\times 2$ rectangle. The Schubert variety $\Omega_{(2,2)}$ (with respect to the standard flag) consists of the single point of the Grassmannian represented by the row-reduced matrix

$$\left[\begin{array}{cccc} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{array}\right]$$

Use this point as the $0$-skeleton of the Grassmannian. For the $2$-skeleton, we attach the $2$-cell given by $$\Omega_{(2,1)}^{\circ}=\left[\begin{array}{cccc} 0 & 0 & 0 & 1 \\ 0 & 1 & \ast & 0 \end{array}\right]$$ to the $0$-skeleton, which is the unique boundary point in its closure. So the $1$-skeleton is actually homeomorphic to the complex projective line.

We now construct the attaching maps that show that if we include the two $4$-dimensional (over $\mathbb{R}$) Schubert cells given by $\Omega_{(1,1)}^{\circ}$ and $\Omega_{(2)}^{\circ}$, we get a $4$-dimensional CW complex. Consider the cell $$\Omega_{(1,1)}=\left[\begin{array}{cccc} 0 & 0 & 1 & \ast \\ 0 & 1 & 0 & \ast \end{array}\right].$$ Define $E$ to be the following subspace of $V_2(\mathbb{C}^4)$:
$$D=\{(v_1,v_2)\in V_2: v_1=(0,0,a,b), v_2=(0,c,d,e)\text{ for some }b,d,e\in \mathbb{C}, a,c\in \mathbb{R}_{\ge 0}\}.$$ Then $D$ maps under $q$ to $\Omega_{(1,1)}=\overline{\Omega_{(1,1)}^\circ}$, its interior mapping to $\Omega_{(1,1)}^\circ$.

We now wish to show that $D$ is topologically a $4$-disk, making $q$ an attaching map. This is the tricky part in general, but we’ll describe the proof explicitly in this special case.

Let $$D_1=\{(0,0,a,b):a\ge 0, b\in \mathbb{C}, a^2+b\overline{b}=1\}$$ and
$$D_2=\{(0,x,0,z):x\ge 0, z\in \mathbb{C}, x^2+z\overline{z}=1\}.$$ Then both are hemispheres in $\mathbb{R}^3$ and hence homeomorphic to $2$-disks, so their Cartesian product $D_1\times D_2$ is homeomorphic to a $4$-disk. So it suffices to show that $D_1\times D_2$ is homeomorphic to $D$.

In particular, given a pair $(v,u)\in D_1\times D_2$, we can apply a certain rotation $T_v$ to $u$ (depending continuously on $v$) that transforms $(v,u)$ into an independent, orthonormal pair of vectors $(v,T_v(u))\in D$. We can define $T_v$ to be the unique rotation that takes $(0,0,1,0)$ to $v$ and fixes all vectors orthogonal to both $(0,0,1,0)$ and $v$. Then since $(0,0,1,0)$ is already orthogonal to $u$ by the definition of $D_2$, this will give a pair $(v,T_v(u))$ of orthonormal vectors, which are independent since $u$ and $v$ are. Tajakka proves (in much more generality) that this map gives the required homeomorphism.

Back to $\mathbb{R}$eality

The real Grassmannian also has a CW complex structure, given by an almost identical construction to the one above (see Hatcher, page 31). Let’s analyze the map described above, replacing all complex coordinates with real ones.

Let $v=(0,0,a,b)$ with $a^2+b^2=1$ and $a\ge 0$ as above. The map $T_v$, in coordinates, sends $(0,x,0,w)$ to $(0,x,-wb,wa)$. Keeping in mind that $a^2+b^2=1$ and $x^2+w^2=1$, it is easy to see that the vectors $(0,0,a,b)$ and $(0,x,-wb,wa)$ form an orthonormal $2$-frame. The disk $D$ in this setting is the set of all orthonormal $2$-frames of this form with $a,x\ge 0$.

Taking the image of $D$ under the quotient map $q:V_2(\mathbb{R}^4)\to \mathrm{Gr}_{\mathbb{R}}(4,2)$, it’s not hard to show that the restriction to the boundary is a $2$-to-$1$ map, as in the construction of the real projective plane. Indeed, if $q((0,0,a,b),(0,x,0,w))$ is the point of the Grassmannian represented by $$\left[\begin{array}{cccc} 0 & 0 & 0 & 1 \\ 0 & 1 & c & 0 \end{array}\right],$$ this forces $a=0$, $b=\pm 1$, $x=1$, $w=-c/b$. So there are two preimages of this point, determined by the two possible values for $b$.

I haven’t worked out (or found in a reference) what the degrees of the attaching maps are for the higher dimensional Schubert cells. If you work out a higher dimensional example, or know of a reference for this, please post it in the comments below!

Thanks to Jake Levinson for enlightening mathematical discussions that motivated me to write this post.

Higher specht polynomials

There are polynomials. There are Specht polynomials. And then there are higher Specht polynomials.

A colleague recently pointed out to me the results of this paper by Ariki, Terasoma, and Yamada, or more concisely summarized here. The authors give a basis of the ring of coinvariants $$R_n=\mathbb{C}[x_1,\ldots,x_n]/(e_1,\ldots,e_n)$$ which, unlike the other bases we’ve discussed on this blog so far, respects the decomposition into irreducible $S_n$-modules. These basis elements include the ordinary Specht polynomials, but also some polynomials of higher degrees, hence the name “higher” Specht polynomials.

Background: What is a Specht polynomial?

The construction of the irreducible representations of the symmetric group $S_n$ (called Specht modules) is often described in terms of “polytabloids”, but can be equivalently described in terms of a basis of Specht polynomials.

Before we describe this basis, recall that the dimension of the irreducible representation $V_\lambda$ of $S_n$ indexed by the partition $\lambda$ is equal to the number $f^\lambda$ of Standard Young Tableaux (SYT) of shape $\lambda$, i.e., the number of fillings of the Young diagram of $\lambda$ with the numbers $1,\ldots,n$ in such a way that the entries are increasing down rows and across columns. An example is shown below for $n=7$ and $\lambda=(3,3,1)$:

Given a standard Young tableau $T$, the Specht polynomial $F_T$ is defined as the following product taken over all of its columns $C$
$$F_T(x_1,\ldots,x_n)=\prod_{C} \prod_{i\lt j\in C}(x_j-x_i)$$

For instance, in the above tableau, $$F_T(x_1,\ldots,x_7)=(x_3-x_1)(x_4-x_1)(x_4-x_3)(x_6-x_2)(x_7-x_5).$$

In other words, the Specht polynomial for $T$ is the product of the Vandermonde determinants given by each of its columns’ entries. It is known that the set of polymomials $F_T$, where $T$ ranges over all SYT’s of shape $\lambda$, form a basis of their span, and their span is isomorphic to $V_\lambda$ as an $S_n$-module (here the $S_n$-action is the usual action on the variables $x_1,\ldots,x_n$).

Specht polynomials in the coinvariant ring

As described in detail in this post, the coinvariant ring is the $S_n$-module $$R_n=\mathbb{C}[x_1,\ldots,x_n]/(e_1,\ldots,e_n)$$ where $e_i$ is the $i$th elementary symmetric function in the variables $x_1,\ldots,x_n$.

The ring $R_n$ inherits the action of $S_n$ on the polynomial ring, and we saw in the previous post on the coinvariant ring that it is isomorphic to the regular representation as an $S_n$-module. It is also graded by polynomial degree, with Hilbert series equal to $(n)_q!$. There are many known bases of $n!$ elements that respect this grading, but the ones we have described in previous posts do not respect the $S_n$-action. In particular, can we find an explicit basis which partitions into bases of the irreducible components in $R_n$?

To start, the Specht polynomials $F_T$ defined above are nonzero in $R_n$ and span one copy of each Specht module. But we know that in the regular representation, there are $f^\lambda$ copies of the irreducible representation $V_\lambda$ for each $\lambda$, and this construction only gives one copy of each. In fact, since the copies of $V_\lambda$ must be alternating in the column variables, they are the lowest-degree copies of each. Hence, we need higher Specht polynomials.

To define them, we first need to understand the RSK correspondence and the charge statistic.

The tools: RSK and charge

Somehow, in all of my years of blogging, I have never actually described the RSK (Robinson-Schensted-Knuth) correspondence. By the nature of the regular representation of $S_n$, we must have $$\sum_{\lambda \vdash n} (f^{\lambda})^2=n!$$ since each irreducible representation $V_\lambda$ of dimension $f^\lambda$ occurs exactly $f^{\lambda}$ times, and the regular representation has dimension $n!$. Recall also that $f^\lambda$ is the number of standard Young tableaux of shape $\lambda$.

The RSK correspondence gives a combinatorially described bijection between pairs of standard Young tableaux of the same partition shape $\lambda$ of $n$ (for some $\lambda$) and permutations of $n$. It’s a beautiful bijection, but since the readers of this post will most likely already be familiar with it, I will simply refer the reader to Simon Rubenstein-Salzedo’s clearly-written introduction to RSK.

The upshot is that there is one higher Specht polynomial for each pair of standard Young tableaux $(S,T)$ of the same shape, denoted $F_T^S$. The degree of $F_T^S$ is given by the charge statistic on $S$, defined as follows. Let $w(S)$ be the reading word of $S$, formed by concatenating the rows of $S$ from bottom to top, and assign subscripts $s(i)$ to the letters $i$ of the reading word recursively as follows. Assign the letter $1$ the subscript $s(1)=0$, and for each $i\ge 1$, if $i+1$ is to the right of $i$ then assign it $s(i+1)=s(i)$, and otherwise $s(i+1)=s(i)+1$. The charge of $S$ is the sum of the subscripts $s(i)$.

For example, let $S$ and $T$ be the tableaux given below:

Then the charge of $S$ is computed by first forming the reading word $$w(S)=3267145$$ and then placing subscripts starting from the letter $1$ according to the rules above:
$$3_2 2_1 6_3 7_3 1_0 4_2 5_2$$
Then the charge of $S$ is $2+1+3+3+0+2+2=13$.

The lowest-degree (minimal charge) case occurs when $S=S_0$ is the canonical choice having elements $1,2,3,\ldots,n$ written in order left to right in each row from top to bottom, as in:

In this case we will recover the standard Specht polynomials, and the others are the higher Specht polynomials.

Construction of the Higher Specht polynomials

We now have the tools to describe the construction. Given a pair $(S,T)$ of standard Young tableaux of the same shape $\lambda$, let $x_T^{\mathrm{charge}(S)}$ be the monomial $x_1^{i_1}\cdots x_n^{i_n}$ where $i_k$ is the charge subscript on the cell in $S$ corresponding to the cell containing $k$ in $T$. For example, if we fill the boxes of $S$ in the example above with their corresponding subscripts in the charge computation, it looks like:

and we have $x_T^{\mathrm{charge}(S)}=x_2^2x_3x_4^2x_5^2x_6^3x_7^3$.

Then the higher Specht polynomial $F_T^S$ is defined as $$F_T^S(x_1,\ldots,x_n)=\sum_{\tau\in \mathrm{Col}(T)}\sum_{\sigma\in \mathrm{Row}(T)} \mathrm{sgn}(\tau)\tau\sigma x_T^{\mathrm{charge}(S)}$$ where $\mathrm{Row}(T)$ and $\mathrm{Col}(T)$ are the subgroups of $S_n$ generated by the permutations of the rows of $T$ and the columns of $T$, respectively.

Notice that this polynomial has degree given by the charge of $S$, which is minimal when $S=S_0$ is the canonical tableau of shape $\lambda$ as described above. And when $S=S_0$, $x_T^{\mathrm{charge}(S)}$ is the product $\prod_{i} x_i^{h_i}$ where $h_i$ is the row of the entry $i$ in $T$. It’s not hard to see that antisymmetrizing this monomial over the column permutations recovers the ordinary Specht polynomial $F_T$. Furthermore, symmetrizing across rows keeps the monomial fixed, and so the higher Specht polynomial $F_T^{S_0}$ is simply a constant multiple of the ordinary Specht polynomial $F_T$.

In general the Specht module $V_\lambda$ occurs once for each tableau $S$ of shape $\lambda$, giving the right multiplicities for each irreducible representation.

Thanks to some Sage code by Nicolas Thiery, we can use this formula to efficiently compute the higher Specht polynomials. Here are the polynomials for $n=3$:

6 \\
2(y-x) \\
2(z-x) \\
z(y-x) \\
y(z-x) \\

Note that the first three and the last are (scalar multiples of) ordinary Specht polynomials, and $z(y-x)$ and $y(z-x)$ are two higher Specht polynomials giving another copy of $V_{(2,1)}$.


In their paper, Ariki, Terasoma, and Yamada actually construct a more general version of the higher Specht polynomials that respects the restricted $S_\mu$-module structure of $R_n$ where $S_{\mu_1}\times \cdots \times S_{\mu_k}\subseteq S_n$ is any Young subgroup. It’s also natural to ask whether there are generalizations of the higher Specht modules to related rings such as the Garsia-Procesi modules or even (we can dream!) the Garsia-Haiman modules.

Shifted partitions and the Orthogonal Grassmannian

In a previous post, we discussed Schubert calculus in the Grassmannian and the various intersection problems it can solve. I have recently been thinking about problems involving the type B variant of Schubert calculus, namely, intersections in the orthogonal Grassmanian. So, it’s time for a combinatorial introduction to the orthogonal Grassmannian!

What is the orthogonal Grassmannian?

In order to generalize Grassmannians to other Lie types, we first need to understand in what sense the ordinary Grassmannian is type A. Recall from this post that the complete flag variety can be written as a quotient of $G=\mathrm{GL}_n$ by a Borel subgroup $B$, such as the group of upper-triangular matrices. It turns out that all partial flag varieties, the varieties of partial flags of certain degrees, can be similarly defined as a quotient $$G/P$$ for a parabolic subgroup $P$, namely a closed intermediate subgroup $B\subset P\subset G$.

The (ordinary) Grassmannian $\mathrm{Gr}(n,k)$, then, can be thought of as the quotient of $\mathrm{GL}_n$ by the parabolic subgroup $S=\mathrm{Stab}(V)$ where $V$ is any fixed $k$-dimensional subspace of $\mathbb{C}^n$. Similarly, we can start with a different reductive group, say the special orthogonal group $\mathrm{SO}_{2n+1}$, and quotient by parabolic subgroups to get partial flag varieties in other Lie types.

In particular, the orthogonal Grassmannian $\mathrm{OG}(2n+1,k)$ is the quotient $\mathrm{SO}_{2n+1}/P$ where $P$ is the stabilizer of a fixed isotropic $k$-dimensional subspace $V$. The term isotropic means that $V$ satisfies $\langle v,w\rangle=0$ for all $v,w\in V$ with respect to a chosen symmetric bilinear form $\langle,\rangle$.

The isotropic condition, at first glance, seems very unnatural. After all, how could a nonzero subspace possibly be so orthogonal to itself? Well, it is first important to note that we are working over $\mathbb{C}$, not $\mathbb{R}$, and the bilinear form is symmetric, not conjugate-symmetric. So for instance, if we choose a basis of $\mathbb{C}^{2n+1}$ and define the bilinear form to be the usual dot product $$\langle (a_1,\ldots,a_{2n+1}),(b_1,\ldots,b_{2n+1})\rangle=a_1b_1+a_2b_2+\cdots+a_{2n+1}b_{2n+1},$$ then the vector $(3,5i,4)$ is orthogonal to itself: $3\cdot 3+5i\cdot 5i+4\cdot 4=0$.

While the choice of symmetric bilinear form does not change the fundamental geometry of the orthogonal Grassmannian, one choice in particular makes things easier to work with in practice: the “reverse dot product” given by
$$\langle (a_1,\ldots,a_{2n+1}),(b_1,\ldots,b_{2n+1})\rangle=\sum_{i=1}^{2n+1} a_ib_{2n+1-i}.$$ In particular, with respect to this symmetric form, the “standard” complete flag $\mathcal{F}$, in which $\mathcal{F}_i$ is the span of the first $i$ rows of the identity matrix $I_{2n+1}$, is an orthogonal flag, with $\mathcal{F}_i^\perp=\mathcal{F}_{2n+1-i}$ for all $i$. Orthogonal flags are precisely the type of flags that are used to define Schubert varieties in the orthogonal grassmannian.

Other useful variants of the reverse dot product involve certain factorial coefficients, but for this post this simpler version will do.

Going back to the main point, note that isotropic subspaces are sent to other isotropic subspaces under the action of the orthorgonal group: if $\langle v,w\rangle=0$ then $\langle Av,Aw\rangle=\langle v,w\rangle=0$ for any $A\in \mathrm{SO}_{2n+1}$. Thus orthogonal Grassmannian $\mathrm{OG}(2n+1,k)$, which is the quotient $\mathrm{SO}_{2n+1}/\mathrm{Stab}(V)$, can be interpreted as the variety of all $k$-dimensional isotropic subspaces of $\mathbb{C}^{2n+1}$.

Schubert varieties and row reduction in $\mathrm{OG}(2n+1,n)$

Just as in the ordinary Grassmannian, there is a Schubert cell decomposition for the orthogonal Grassmannian, and the combinatorics of Schubert varieties is particularly nice in the case of $\mathrm{OG}(2n+1,n)$ in which the orthogonal subspaces are “half dimension” $n$. (In particular, this corresponds to the “cominuscule” type in which the simple root associated to our maximal parabolic subgroup is the special root in type $B$. See the introduction here or this book for more details.)

Recall that in $\mathrm{Gr}(2n+1,n)$, the Schubert varieties are indexed by partitions $\lambda$ whose Young diagram fit inside an $n\times (n+1)$ rectangle. Suppose we divide this rectangle into two staircases as shown below using the red cut, and only consider the partitions $\lambda$ that are symmetric with respect to the reflective map taking the upper staircase to the lower.

We claim that the Schubert varieties of the orthogonal Grassmannian are indexed by the “shifted partitions” formed by ignoring the lower half of these symmetric partition diagrams. In fact, the Schubert varieties consist of the isotropic elements of the ordinary Schubert varieties, giving a natural embedding $\mathrm{OG}(2n+1,n)\to \mathrm{Gr}(2n+1,n)$ that respects the Schubert decompositions.

To get a sense of how this works, let’s look at the example of the partition $(4,3,1)$ shown above, in the case $n=4$. As described in this post, in the ordinary Grassmannian, the Schubert cell $\Omega_\lambda^{\circ}$ with respect to the standard flag is given by the set of vector spaces spanned by the rows of matrices whose reduced row echelon form looks like:

Combinatorially, this is formed by drawing the staircase pattern shown at the left, then adding the partition parts $(4,3,1)$ to it, and placing a $1$ at the end of each of the resulting rows for the reduced row echelon form.

Now, which of these spaces are isotropic? In other words, what is $\Omega_\lambda^{\circ}\cap \mathrm{OG}(2n+1,n)$? Well, suppose we label the starred entries as shown, where we have ommited the $0$’s to make the entries more readable:

Then I claim that the entries $l,j,k,h,i,e$ are all uniquely determined by the values of the remaining variables $a,b,c,d,f,g$. Thus there is one isotropic subspace in this cell for each choice of values $a,b,c,d,f,g$, corresponding to the “lower half” of the partition diagram we started with:

Indeed, let the rows of the matrix be labeled $\mathbf{1},\mathbf{2},\mathbf{3},\mathbf{4}$ from top to bottom as shown, and suppose its row span is isotropic. Since row $\mathbf{1}$ and $\mathbf{4}$ are orthogonal with respect to the reverse dot product, we get the relation $$l+a=0,$$ which expresses $l=-a$ in terms of $a$.

Now, rows $\mathbf{2}$ and $\mathbf{4}$ are also orthogonal, which means that $$b+k=0,$$ so we can similarly eliminate $k$. From rows $\mathbf{2}$ and $\mathbf{3}$, we obtain $f+j=0$, which expresses $j$ in terms of the lower variables. We then pair row $\mathbf{3}$ with itself to see that $h+g^2=0$, eliminating $h$, and finally pairing $\mathbf{3}$ with $\mathbf{4}$ we have $i+gc+d=0$, so $i$ is now expressed in terms of lower variables as well.

Moreover, these are the only relations we get from the isotropic condition – any other pairings of rows give the trivial relation $0=0$. So in this case the Schubert variety restricted to the orthogonal Grassmannian has half the dimension of the original, generated by the possible values for $a,b,c,d,f,g$.

General elimination argument

Why does this elimination process work in general, for a symmetric shape $\lambda$? Label the steps of the boundary path of $\lambda$ by $1,2,3,\ldots$ from SW to NE in the lower left half, and label them from NE to SW in the upper right half, as shown:

Then the labels on the vertical steps in the lower left half give the column indices of the $1$’s in the corresponding rows of the matrix. The labels on the horizontal steps in the upper half, which match these labels by symmetry, give the column indices from the right of the corresponding starred columns from right to left.

This means that the $1$’s in the lower left of the matrix correspond to the opposite columns of those containing letters in the upper right half. It follows that we can use the orthogonality relations to pair a $1$ (which is leftmost in its row) with a column entry in a higher or equal row so as to express that entry in terms of other letters to its lower left. The $1$ is in a lower or equal row in these pairings precisely for the entries whose corresponding square lies above the staircase cut. Thus we can always express the upper right variables in terms of the lower left, as in our example above.

Isotropic implies symmetric

We can now conversely show that if a point of the Grassmannian is isotropic, then its corresponding partition is symmetric about the staircase cut. Indeed, we claim that if the partition is not symmetric, then two of the $1$’s of the matrix will be in opposite columns, resulting in a nonzero reverse dot product between these two rows.

To show this, first note that we cannot have a $1$ in the middle column, for otherwise its row would not be orthogonal to itself. Therefore, of the remaining $2n$ columns, some $n$ of them have a $1$ in them, and the complementary columns must be the other $n$, the ones without a $1$. But this is the same combinatorial data as choosing some $k$ columns from the first $n$ columns, and the remaining columns being forced to be a pivot column if their opposite is not and vice versa. This implies that the partition is symmetric.

Coming soon: Combinatorics of shifted partitions and tableaux

From the above analysis, it follows that the orthogonal Grassmannian is partitioned into Schubert cells given by the data of a shifted partition, the upper half of a symmetric partition diagram:

Such a partition is denoted by its distinct “parts”, the number of squares in each row, thought of as the parts of a partition with distinct parts shifted over by the staircase. For instance, the shifted partition above is denoted $(3,1)$.

The beauty of shifted partitions is that so much of the original tableaux combinatorics that goes into ordinary Schubert calculus works almost the same way for shifted tableaux and the orthogonal Grassmannian. We can define jeu de taquin, Knuth equivalence, and dual equivalence on shifted tableaux, there is a natural notion of a Littlewood-Richardson shifted tableau, and these notions give rise to formulas for both intersections of Schubert varieties in the orthogonal Grassmannian and to products of Schur $P$-functions or Schur $Q$-functions, both of which are specializations of the Hall-Littlewood polynomials.

I’ll likely go further into the combinatorics of shifted tableaux in future posts, but for now, I hope you enjoyed the breakdown of why shifted partitions are the natural indexing objects for the Schubert decomposition of the orthogonal Grassmannian.

The structure of the Garsia-Procesi modules $R_\mu$

Somehow, in all the time I’ve posted here, I’ve not yet described the structure of my favorite graded $S_n$-modules. I mentioned them briefly at the end of the Springer Correspondence series, and talked in depth about a particular one of them – the ring of coinvariants – in this post, but it’s about time for…

The Garsia-Procesi modules!

Also known as the cohomology rings of the Springer fibers in type $A$, or as the coordinate ring of the intersection of the closure of a nilpotent conjugacy class of $n\times n$ matrices with a torus, with a link between these two interpretations given in a paper of DeConcini and Procesi. But the work of Tanisaki, and Garsia and Procesi, allows us to work with these modules in an entirely elementary way.

Using Tanisaki’s approach, we can define $$R_{\mu}=\mathbb{C}[x_1,\ldots,x_n]/I_{\mu},$$ where $I_{\mu}$ is the ideal generated by the partial elementary symmetric functions defined as follows. Recall that the elementary symmetric function $e_d(z_1,\ldots,z_k)$ is the sum of all square-free monomials of degree $d$ in the set of variables $z_i$. Let $S\subset\{x_1,\ldots,x_n\}$ with $|S|=k$. Then the elementary symmetric function $e_r(S)$ in this subset of the variables is called a partial elementary symmetric function, and we have $$I_{\mu}=(e_r(S) : k-d_k(\mu) \lt r \le k, |S|=k).$$ Here, $d_k(\mu)=\mu’_n+\mu’_{n-1}+\cdots+ \mu’_{n-k+1}$ is the number of boxes in the last $k$ columns of $\mu$, where we pad the transpose partition $\mu’$ with $0$’s so that it has $n$ parts.

This ring $R_\mu$ inherits the natural action of $S_n$ on $\mathbb{C}[x_1,\ldots,x_n]$ by permuting the variables, since $I_\mu$ is fixed under this action. Since $I_\mu$ is also a homogeneous ideal, $R_\mu$ is a graded $S_n$-module, graded by degree.

To illustrate the construction, suppose $n=4$ and $\mu=(3,1)$. Then to compute $I_{\mu}$, first consider subsets $S$ of $\{x_1,\ldots,x_4\}$ of size $k=1$. We have $d_1(\mu)=0$ since the fourth column of the Young diagram of $\mu$ is empty (see image below), and so in order for $e_r(S)$ to be in $I_\mu$ we must have $1-0\lt r \le 1$, which is impossible. So there are no partial elementary symmetric functions in $1$ variable in $I_\mu$.


For subsets $S$ of size $k=2$, we have $d_2(\mu)=1$ since there is one box among the last two columns (columns $3$ and $4$) of $\mu$, and we must have $2-1\lt r\le 2$. So $r$ can only be $2$, and we have the partial elementary symmetric functions $e_2(S)$ for all subsets $S$ of size $2$. This gives us the six polynomials $$x_1x_2,\hspace{0.3cm} x_1x_3,\hspace{0.3cm} x_1x_4,\hspace{0.3cm} x_2x_3,\hspace{0.3cm} x_2x_4,\hspace{0.3cm} x_3x_4.$$

For subsets $S$ of size $k=3$, we have $d_3(\mu)=2$, and so $3-2 \lt r\le 3$. We therefore have $e_2(S)$ and $e_3(S)$ for each such subset $S$ in $I_\mu$, and this gives us the eight additional polynomials $$x_1x_2+x_1x_3+x_2x_3, \hspace{0.5cm}x_1x_2+x_1x_4+x_2x_4,$$ $$x_1x_3+x_1x_4+x_3x_4,\hspace{0.5cm} x_2x_3+x_2x_4+x_3x_4,$$ $$x_1x_2x_3, \hspace{0.4cm} x_1x_2x_4, \hspace{0.4cm} x_1x_3x_4,\hspace{0.4cm} x_2x_3x_4$$

Finally, for $S=\{x_1,x_2,x_3,x_4\}$, we have $d_4(\mu)=4$ and so $4-4\lt r\le 4$. Thus all of the full elementary symmetric functions $e_1,\ldots,e_4$ in the four variables are also relations in $I_{\mu}$. All in all we have
I_{(3,1)}= &(e_2(x_1,x_2), e_2(x_1,x_3),\ldots, e_2(x_3,x_4), \\
& e_2(x_1,x_2,x_3), \ldots, e_2(x_2,x_3,x_4), \\
& e_3(x_1,x_2,x_3), \ldots, e_3(x_2,x_3,x_4), \\
& e_1(x_1,\ldots,x_4), e_2(x_1,\ldots,x_4), e_3(x_1,\ldots,x_4), e_4(x_1,\ldots,x_4))

As two more examples, it’s clear that $R_{(1^n)}=\mathbb{C}[x_1,\ldots,x_n]/(e_1,\ldots,e_n)$ is the ring of coninvariants under the $S_n$ action, and $R_{(n)}=\mathbb{C}$ is the trivial representation. So $R_\mu$ is a generalization of the coinvariant ring, and in fact the graded Frobenius characteristic of $R_\mu$ is the Hall-Littlewood polynomial $\widetilde{H}_\mu(x;q)$.

Where do these relations come from? The rings $R_\mu$ were originally defined as follows. Let $A$ be a nilpotent $n\times n$ matrix over $\mathbb{C}$. Then $A$ has all $0$ eigenvalues, and so it is conjugate to a matrix in Jordan normal form whose Jordan blocks have all $0$’s on the diagonal. The sizes of the Jordan blocks, written in nonincreasing order form a partition $\mu’$, and this partition uniquely determines the conjugacy class of $A$. In other words:

There is exactly one nilpotent conjugacy class $C_{\mu’}$ in the space of $n\times n$ matrices for each partition $\mu’$ of $n$.

The closures of these conjugacy classes $\overline{C_{\mu’}}$ form closed matrix varieties, and their coordinate rings were studied here. However, they are easier to get a handle on after intersecting with the set $T$ of diagonal matrices, leading to an interesting natural question: what is the subvariety of diagonal matrices in the closure of the nilpotent conjugacy class $C_{\mu’}$? Defining $R_\mu=\mathcal{O}(\overline{C_{\mu’}}\cap T)$, we obtain the same modules as above.

Tanisaki found the presentation for $R_\mu$ given above using roughly the following argument. Consider the matrix $A-tI$, where $A\in C_{\mu’}$. Then one can show (see, for instance, the discussion of invariant factors and elementary divisors in the article on Smith normal form on Wikipedia) that the largest power of $t$ dividing all of the $k\times k$ minors of $A-tI$, say $t^{d_k}$, is fixed under conjugation, so we can assume $A$ is in Jordan normal form. Then it’s not hard to see, by analyzing the Jordan blocks, that this power of $t$ is $t^{d_k(\mu)}$ where $\mu$ is the transpose partition of $\mu’$ and $d_k(\mu)$ is defined as above – the sums of the ending columns of $\mu$.

It follows that any element of the closure of $C_\mu$ must also have this property, and so if $X=\mathrm{diag}(x_1,\ldots,x_n)\in \overline{C_\mu}\cap T$ then we have $$t^{d_k(\mu)} | (x_{i_1}-t)(x_{i_2}-t)\cdots (x_{i_k}-t)$$ for any subset $S=\{x_{i_1},\ldots,x_{i_k}\}$ of $\{x_1,\ldots,x_n\}$. Expanding the right hand side as a polynomial in $t$ using Vieta’s formulas, we see that the elementary symmetric functions $e_r(S)$ vanish on $X$ as soon as $r \gt k-d_k(\mu)$, which is exactly the relations that describe $I_\mu$ above.

It takes somewhat more work to prove that these relations generate the entire ideal, but this can be shown by showing that $R_\mu$ has the right dimension, namely the multinomial coefficient $\binom{n}{\mu}$. And for that, we’ll discuss on page 2 the monomial basis of Garsia and Procesi.


Sometimes it’s the missteps in life that lead to the greatest adventures down the road.

For me, my pursuit of a Ph.D. in mathematics, specifically in algebraic combinatorics, might be traced back to my freshman year as an undergraduate at MIT. Coming off of a series of successes in high school math competitions and other science-related endeavors (thanks to my loving and very mathematical family!), I was a confident and excited 18-year old whose dream was to become a physicist and use my mathematical skills to, I don’t know, come up with a unified field theory or something.

Me at the age of 18-ish.

Me at the age of 18-ish.

But I loved pure math too, and a number of my friends were signed up for the undergraduate Algebraic Combinatorics class in the spring, so my young ambitious self added it to my already packed course load. I had no idea what “Algebraic Combinatorics” even meant, but I did hear that it was being taught by Richard Stanley, a world expert in the area. How could I pass up that chance? What if he didn’t teach it again before I left MIT?

On the first day of the class, Stanley started with a simple combinatorial question. It was something like the following: In a complete graph with $n$ vertices, how many walks of length $k$ starting at vertex $v$ end up back at vertex $v$ on the last step? For instance, if $n=5$ and $k=2$, the graph looks like: graph5-noblue and there are four closed walks of length two, from $v$ to any other vertex and back again:


There is an elementary (though messy) way to solve this, but Stanley went forth with an algebraic proof. He considered the adjacency matrix $A$ of the complete graph, and showed that the total number of loops of length $k$ starting from any vertex is the trace of $A^k$. One can then compute this trace using eigenvalues and divide by $n$ to get the number of loops starting at $v$. Beautiful!

I remember sitting in my seat, wide-eyed, watching Richard Stanley quietly but authoritatively discuss the technique. It was incredible to me that advanced tools from linear algebra could be used to so elegantly solve such a simple, concrete problem. To use a term from another area of algebraic combinatorics, I was hooked.

But I was also a freshman, and didn’t yet have a strong grasp of some of the other algebraic concepts being used in the course. I studied hard but wound up with a B+ in the class. Me, get a B+ in a math class? I was horrified, my 18-year-old Little-Miss-Perfect confidence shattered. Now, not only was I fascinated with the subject, I gained respect for it. It was a worthy challenge, and I couldn’t help but come back for more.

In the years that followed, I took more courses on similar subjects and wrote several undergraduate research papers. I dabbled in other areas as well, but was always drawn back to the interplay between combinatorics and algebra. I now find myself, as of Friday, May 20, 2016, having completed my Ph.D. at UC Berkeley on a topic in algebraic combinatorics…

Maria Grad - Web - 19

…and I often wonder how much that silly little B+ motivated me throughout the years.

(See page 2 for a summary of my thesis. My full thesis can be found here.)

What do Schubert curves, Young tableaux, and K-theory have in common? (Part III)

This is the third and final post in our expository series of posts (see Part I and Part II) on the recent paper coauthored by Jake Levinson and myself.

Last time, we discussed the fact that the operator $\omega$ on certain Young tableaux is actually the monodromy operator of a certain covering map from the real locus of the Schubert curve $S$ to $\mathbb{RP}^1$. Now, we’ll show how our improved algorithm for computing $\omega$ can be used to approach some natural questions about the geometry of the curve $S$. For instance, how many (complex) connected components does $S$ have? What is its genus? Is $S$ a smooth (complex) curve?

The genus of $S$

The arithmetic genus of a connected curve $S$ can be defined as $g=1-\chi(\mathcal{O}_S)$ where $$\chi(\mathcal{O}_S)=\dim H^0(\mathcal{O}_S)-\dim H^1(\mathcal{O}_S)$$ is the Euler characteristic and $\mathcal{O}_S$ is the structure sheaf. So, to compute the genus it suffices to compute the Euler characteristic, which can alternatively be defined in terms of the $K$-theory of the Grassmannian.

The $K$-theory ring $K(\mathrm{Gr}(n,k))$

The $K$-theory ring $K(X)$ of a scheme $X$ is defined as follows. First, consider the free abelian group $G$ generated by isomorphism classes of locally free coherent sheaves (a.k.a. vector bundles) on $X$. Then define $K(X)$, as a group, to be the quotient of $G$ by “short exact sequences”, that is, the quotient $G/H$ where $H$ is the subgroup generated by expressions of the form $[\mathcal{E}_1]+[\mathcal{E}_2]-[\mathcal{E}]$ where $0\to \mathcal{E}_1 \to \mathcal{E} \to \mathcal{E}_2 \to 0$ is a short exact sequence of vector bundles on $X$. This gives the additive structure on $K(X)$, and the tensor product operation on vector bundles makes it into a ring. It turns out that, in the case that $X$ is smooth (such as a Grassmannian!) then we get the exact same ring if we remove the “locally free” restriction and consider coherent sheaves modulo short exact sequences.

Where does this construction come from? Well, a simpler example of $K$-theory is the construction of the Grothendieck group of an abelian monoid. Consider an abelian monoid M (recall that a monoid is a set with an associative binary operation and an identity element, like a group without inverses). We can construct an associated group $K(M)$ by taking the quotient free abelian group generated by elements $[m]$ for $m\in M$ by the subgroup generated by expressions of the form $[m]+[n]-[m+n]$. So, for instance, $K(\mathbb{N})=\mathbb{Z}$. In a sense we are groupifying monoids. The natural monoidal operation on vector spaces is $\oplus$, so if $X$ is a point, then all short exact sequences split and the $K$-theory ring $K(X)$ is the Grothendieck ring of this monoid.

A good exposition on the basics of $K$-theory can be found here, and for the $K$-theory of Grassmannians, see Buch’s paper. For now, we’ll just give a brief description of how the $K$-theory of the Grassmannian works, and how it gives us a handle on the Euler characteristic of Schubert curves.

Recall from this post that the CW complex structure given by the Schubert varieties shows that the classes $[\Omega_\lambda]$, where $\lambda$ is a partition fitting inside a $k\times (n-k)$ rectangle, generate the cohomology ring $H^\ast(\mathrm{Gr}(n,k))$. Similarly, the $K$-theory ring is a filtered ring generated by the classes of the coherent sheaves $[\mathcal{O}_{\lambda}]$ where if $\iota$ is the inclusion map $\iota:\Omega_\lambda\to \mathrm{Gr}(n,k)$, then $\mathcal{O}_\lambda=\iota_\ast \mathcal{O}_{\Omega_\lambda}$. Multiplication of these basic classes is given by a variant of the Littlewood-Richardson rule: $$[\mathcal{O}_\lambda]\cdot [\mathcal{O}_\nu]=\sum_\nu (-1)^{|\nu|-|\lambda|-|\mu|}c^\nu_{\lambda\mu}[\mathcal{O}_\nu]$$ where if $|\nu|=|\lambda|+|\mu|$ then $c^{\nu}_{\lambda\mu}$ is the usual Littlewood-Richardson coefficient. If $|\nu|<|\lambda|+|\mu|$ then the coefficient is zero, and if it $|\nu|>|\lambda|+|\mu|$ then $c^{\nu}_{\lambda\mu}$ is a nonnegative integer. We will refer to these nonnegative values as $K$-theory coefficients.

$K$-theory and the Euler characteristic

The $K$-theory ring is especially useful in computing Euler characteristics. It turns out that the Euler characteristic gives an (additive) group homomorphism $\chi:K(X)\to \mathbb{Z}$. To show this, it suffices to show that if $0\to \mathcal{A}\to \mathcal{B}\to \mathcal{C}\to 0$ is a short exact sequence of coherent sheaves on $X$, then $\chi(\mathcal{A})+\chi(\mathcal{C})-\chi(\mathcal{B})=0$. Indeed, such a short exact sequence gives rise to a long exact sequence in cohomology:

& H^0(\mathcal{A}) & \to & H^0(\mathcal{B}) & \to & H^0(\mathcal{C}) \\
\to &H^1(\mathcal{A}) & \to & H^1(\mathcal{B}) & \to & H^1(\mathcal{C}) \\
\to &H^2(\mathcal{A}) & \to & H^2(\mathcal{B}) & \to & H^2(\mathcal{C}) \\
\cdots & & & & &

and the alternating sum of the dimensions of any exact sequence must be zero. Thus we have $$\begin{eqnarray*}0&=&\sum_i (-1)^i\dim H^i(\mathcal{A})-\sum_i (-1)^i\dim H^i(\mathcal{b})+\sum_i (-1)^i\dim H^i(\mathcal{C}) \\ &=&\chi(\mathcal{A})+\chi(\mathcal{C})-\chi(\mathcal{B})\end{eqnarray*}$$ as desired.

Therefore, it makes sense to talk about the Euler characteristic of a class of coherent sheaves in $K(X)$. In fact, in our situation, we have a closed subset $S$ of $X=\mathrm{Gr}(n,k)$, say with inclusion map $j:S\to X$, and so the Euler characteristic of the pushforward $j_\ast\mathcal{O}_S$ is equal to $\chi(\mathcal{O}_S)$ itself. We can now compute the Euler characteristic $\chi(j_\ast\mathcal{O}_S)$ using the structure of the $K$-theory ring of the Grassmannian. Indeed, $S$ is the intersection of Schubert varieties indexed by the three partitions $\alpha$, $\beta$, and $\gamma$ (see Part I). So in the $K$-theory ring, if we identify structure sheaves of closed subvarieties with their pushforwards under inclusion maps, we have $$[\mathcal{O}_S]=[\mathcal{O}_\alpha]\cdot [\mathcal{O}_\beta]\cdot [\mathcal{O}_\gamma].$$ By the $K$-theoretic Littlewood-Richardson rule described above, this product expands as a sum of integer multiples of classes $[\mathcal{O}_\nu]$ where $|\nu|\ge |\alpha|+|\beta|+|\gamma|$. But in our setup we have $|\alpha|+|\beta|+|\gamma|=k(n-k)-1$, so $\nu$ is either the entire $k\times (n-k)$ rectangle (call this $\rho$) or it is the rectangle minus a single box (call this $\rho’$). In other words, we have:

$$[\mathcal{O}_S]=c^{\rho’}_{\alpha,\beta,\gamma}[\mathcal{O}_{\rho’}]-k[\mathcal{O}_{\rho}]$$ where $k$ is an integer determined by the $K$-theory coefficients. Notice that $c^{\rho’}_{\alpha,\beta,\gamma}$ is the usual Littlewood-Richardson coefficient, and counts exactly the size of the fibers (the set $\omega$ acts on) in our map from Part II. Let’s call this number $N$.

Finally, notice that $\Omega_\rho$ and $\Omega_{\rho’}$ are a point and a copy of $\mathbb{P}^1$ respectively, and so both have Euler characteristic $1$. It follows that $$\chi(\mathcal{O}_S)=N-k.$$
Going back to the genus, we see that if $S$ is connected, we have $g=1-\chi(\mathcal{O}_S)=k-(N-1)$.

Computing $k$ in terms of $\omega$

The fascinating thing about our algorithm for $\omega$ is that certain steps of the algorithm combinatorially correspond to certain tableaux that enumerate the $K$-theory coefficients, giving us information about the genus of $S$. These tableaux are called “genomic tableaux”, and were first introduced by Pechenik and Yong.

In our case, the genomic tableaux that enumerate $k$ can be defined as follows. The data of a tableau $T$ and two marked squares $\square_1$ and $\square_2$ in $T$ is a genomic tableau if:

  1. The marked squares are non-adjacent and contain the same entry $i$,
  2. There are no $i$’s between $\square_1$ and $\square_2$ in the reading word of $T$,
  3. If we delete either $\square_1$ or $\square_2$, every suffix of the resulting reading word is ballot (has more $j$’s than $j+1$’s for all $j$).

For instance, consider the following tableaux with two marked (shaded) squares:


Property 1 means that the fourth tableau is not genomic: the marked squares, while they do contain the same entry, are adjacent squares. The first tableau above violates Property 2, because there is a $2$ between the two marked $2$’s in reading order. Finally, the second tableau above violates Property 3, because if we delete the top marked $1$ then the suffix $221$ is not ballot. The third tableau above satisfies all three properties, and so it is genomic.

Finally, consider the algorithm for $\omega$ described in Part I. Jake and I discovered that the steps in Phase 1 in which the special box does not move to an adjacent square are in bijective correspondence with the $K$-theory tableau of the total skew shape $\gamma^c/\alpha$ and content $\beta$ (where the marked squares only add $1$ to the total number of $i$’s). The correspondence is formed by simply filling the starting and ending positions of the special box with the entry $i$ that it moved past, and making these the marked squares of a genomic tableau. In other words:

The $K$-theory coefficient $k$ is equal to the number of non-adjacent moves in all Phase 1’s of the local algorithm for $\omega$.

Geometric consequences

This connection allows us to get a handle on the geometry of the Schubert curves $S$ using our new algorithm. As one illuminating example, let’s consider the case when $\omega$ is the identity permutation.

It turns out that the only way for $\omega$ to map a tableau back to itself is if Phase 1 consists of all vertical slides and Phase 2 is all horizontal slides; then the final shuffle step simply reverses these moves. This means that we have no non-adjacent moves, and so $k=0$ in this case. Since $\omega$, the monodromy operator on the real locus, is the identity, we also know that the number of real connected components is equal to $N$, which is an upper bound on the number of complex connected components (see here), which in turn is an upper bound on the Euler characteristic $\chi(\mathcal{O}_S)=\dim H^0(\mathcal{O}_S)-\dim H^1(\mathcal{O}_S)$. But the Euler characteristic is equal to $N$ in this case, and so there must be $N$ complex connected components, one for each of the real connected components. It follows that $\dim H^1(\mathcal{O}_S)=0$, so the arithmetic genus of each of these components is zero.

We also know each of these components is integral, and so they must each be isomorphic to $\mathbb{CP}^1$ (see Hartshorne, section 4.1 exercise 1.8). We have therefore determined the entire structure of the complex Schubert curve $S$ in the case that $\omega$ is the identity map, using the connection with $K$-theory described above.

Similar analyses lead to other geometric results: we have also shown that the Schubert curves $S$ can have arbitrarily high genus, and can arbitrarily many complex connected components, for various values of $\alpha$, $\beta$, and $\gamma$.

So, what do Schubert curves, Young tableaux, and $K$-theory have in common? A little monodromy operator called $\omega$.

What do Schubert curves, Young tableaux, and K-theory have in common? (Part II)

In the last post, we discussed an operation $\newcommand{\box}{\square}
\omega$ on skew Littlewood-Richardson Young tableaux with one marked inner corner, defined as a commutator of rectification and shuffling. As a continuation, we’ll now discuss where this operator arises in geometry.

Schubert curves: Relaxing a Restriction

Recall from our post on Schubert calculus that we can use Schubert varieties to answer the question:

Given four lines $\ell_1,\ell_2,\ell_3,\ell_4\in \mathbb{C}\mathbb{P}^3$, how many lines intersect all four of these lines in a point?

In particular, given a complete flag, i.e. a chain of subspaces $$0=F_0\subset F_1\subset\cdots \subset F_m=\mathbb{C}^m$$ where each $F_i$ has dimension $i$, the Schubert variety of a partition $\lambda$ with respect to this flag is a subvariety of the Grassmannian $\mathrm{Gr}^n(\mathbb{C}^m)$ defined as
$$\Omega_{\lambda}(F_\bullet)=\{V\in \mathrm{Gr}^n(\mathbb{C}^m)\mid \dim V\cap F_{n+i-\lambda_i}\ge i.\}$$

Here $\lambda$ must fit in a $k\times (n-k)$ box in order for the Schubert variety to be nonempty. In the case of the question above, we can translate the question into an intersection problem in $\mathrm{Gr}^2(\mathbb{C}^4)$ with four general two-dimensional subspaces $P_1,P_2,P_3,P_4\subset \mathbb{C}^4$, and construct complete flags $F_\bullet^{(1)},F_\bullet^{(2)},F_\bullet^{(3)},F_\bullet^{(4)}$ such that their second subspace $F^{(i)}_2$ is $P_i$ for each $i=1,2,3,4$. Then the intersection condition is the problem of finding a plane that intersects all $P_i$’s in a line. The variety of all planes intersecting $P_i$ in a line is $\Omega_\box(F_\bullet^{(i)})$ for each $i$, and so the set of all solutions is the intersection $$\Omega_\box(F_\bullet^{(1)})\cap \Omega_\box(F_\bullet^{(2)})\cap \Omega_\box(F_\bullet^{(3)})\cap \Omega_\box(F_\bullet^{(4)}).$$ And, as discussed in our post on Schubert calculus, since the $k\times(n-k)$ box has size $4$ in $\mathrm{Gr}^2(\mathbb{C}^4)$ and the four partitions involved have sizes summing to $4$, this intersection has dimension $0$. The Littlewood-Richardson rule then tells us that the number of points in this zero-dimensional intersection is the Littlewood-Richardson coefficient $c_{\box,\box,\box,\box}^{(2,2)}$.

What happens if we relax one of the conditions in the problem, so that we are only intersecting three of the Schubert varieties above? In this case we get a one-parameter family of solutions, which we call a Schubert curve.

To define a Schubert curve in general, we require a sufficiently “generic” choice of $r$ flags $F_\bullet^{(1)},\ldots, F_\bullet^{(r)}$ and a list of $r$ partitions $\lambda^{(1)},\ldots,\lambda^{(r)}$ (fitting inside the $k\times (n-k)$ box) whose sizes sum to $k(n-k)-1$. It turns out that one can choose any flags $F_\bullet$ defined by the iterated derivatives at chosen points on the rational normal curve, defined by the locus of points of the form $$(1:t:t^2:t^3:\cdots:t^{n-1})\in \mathbb{CP}^n$$ (along with the limiting point $(0:0:\cdots:1)$.) In particular, consider the flag whose $k$th subspace $F_k$ is the span of the first $k$ rows of the matrix of iterated derivatives at the point on this curve parameterized by $t$:
1 & t & t^2 & t^3 & \cdots & t^{n-1} \\
0 & 1 & 2t & 3t^2 & \cdots & (n-1) t^{n-2} \\
0 & 0 & 2 & 6t & \cdots & \frac{(n-1)!}{(n-3)!} t^{n-3} \\
0 & 0 & 0 & 6 & \cdots & \frac{(n-1)!}{(n-4)!} t^{n-3} \\
\vdots & \vdots & \vdots & \vdots &\ddots & \vdots \\
0 & 0 & 0 & 0 & \cdots & (n-1)!

This is called the osculating flag at the point $t$, and if we pick a number of points on the curve and choose their osculating flag, it turns out that they are in sufficiently general position in order for the Schubert intersections to have the expected dimension. So, we pick some number $r$ of these osculating flags, and choose exactly $r$ partitions $\lambda^{(1)},\ldots,\lambda^{(r)}$ with sizes summing to $k(n-k)-1$. Intersecting the resulting Schubert varieties defines a Schubert curve $S$.

A covering map

In order to get a handle on these curves, we consider the intersection of $S$ with another Schubert variety of a single box, $\Omega_\box(F_\bullet)$. In particular, after choosing our $r$ osculating flags, choose an $(r+1)$st point $t$ on the rational normal curve and choose the single-box partition $\lambda=(1)$. Intersecting the resulting Schubert variety $\Omega_\box(F_\bullet)$ with our Schubert curve $S$ gives us a zero-dimensional intersection, with the number of points given by the Littlewood-Richardson coefficient $c:=c^{B}_{\lambda^{(1)},\ldots,\lambda^{(r)},\box}$ where $B=((n-k)^k)$ is the $k\times (n-k)$ box partition.

By varying the choice of $t$, we obtain a partition of an open subset of the Schubert curve into sets of $c$ points. We can then define a map from this open subset of $S$ to the points of $\mathbb{CP}^1$ for which the preimage of $(1:t)$ consists of the $c$ points in the intersection given by choosing the $(r+1)$st point to be $(1:t:t^2:t^3:\cdots:t^{n-1})$.

In this paper written by my coauthor Jake Levinson, it is shown that if we choose all $r+1$ points to be real, then this can be extended to a map $S\to \mathbb{CP}^1$ for which the real locus $S(\mathbb{R})$ is a smooth, finite covering of $\mathbb{RP}^1$. The fibers of this map have size $c=c^{B}_{\lambda^{(1)},\ldots,\lambda^{(r)},\box}$. Note that this Littlewood-Richardson coefficient counts the number of ways of filling the box $B$ with skew Littlewood-Richardson tableaux with contents $\lambda^{(1)},\ldots,\lambda^{(r)},\box$ such that each skew shape extends the previous shape outwards. In addition, by the symmetry of Littlewood-Richardson coefficients, it doesn’t matter what order in which we use the partitions. It turns out that there is a canonical way of labeling the fibers by these chains of tableaux, for which the monodromy of the covering is described by shuffling operators.

Let’s be more precise, and consider the simplest interesting case, $r=3$. We have three partitions $\alpha$, $\beta$, and $\gamma$ such that $$|\alpha|+|\beta|+|\gamma|=k(n-k)-1=|B|-1.$$ Let’s choose, for simplicity, the three points $0$, $1$, and $\infty$ to define the respective osculating flags. Then we can label the points in the fiber $f^{-1}(0)$ of the map $f:S\to \mathbb{RP}^1$ by the fillings of the box with a chain of Littlewood-Richardson tableaux of contents $\alpha$, $\box$, $\beta$, and $\gamma$ in that order. Note that there is only one Littlewood-Richardson tableau of straight shape and content $\alpha$, and similarly for the anti-straight shape $\gamma$, so the data here consists of a skew Littlewood-Richardson tableau of content $\beta$ with an inner corner chosen to be the marked box. This is the same object that we were considering in Part I.

Now, we can also label the fiber $f^{-1}(1)$ by the chains of contents $\box$, $\alpha$, $\beta$, and $\gamma$ in that order, and finally label the fiber $f^{-1}(\infty)$ by the chains of contents $\alpha$, $\beta$, $\box$, and $\gamma$ in that order, in such a way that if we follow the curve $S$ along an arc from a point in $f^{-1}(0)$ to $f^{-1}(1)$ or similarly from $1$ to $\infty$ or $\infty$ to $0$, then then the map between the fibers is given by the shuffling operations described in the last post! In particular if we follow the arc from $0$ to $\infty$ that passes through $1$, the corresponding operation on the fibers is given by the “evacuation-shuffle”, or the first three steps of the operator $\omega$ described in the previous post. The arc from $\infty$ back to $0$ on the other side is given by the “shuffle” of the $\box$ past $\beta$, which is the last step of $\omega$. All in all, $\omega$ gives us the monodromy operator on the zero fiber $f^{-1}(0)$.

The following picture sums this up:


So, the real geometry of the Schubert curve boils down to an understanding of the permutation $\omega$. Our local algorithm allows us to get a better handle on the orbits of $\omega$, and hence tell us things about the number of real connected components, the lengths of these orbits, and even in some cases the geometry of the complex curve $S$.

Next time, I’ll discuss some of these consequences, as well as some fascinating connections to the $K$-theory of the Grassmannian. Stay tuned!

The Springer Correspondence, Part III: Hall-Littlewood Polynomials

This is the third post in a series on the Springer correspondence. See Part I and Part II for background.

In this post, we’ll restrict ourselves to the type A setting, in which $\DeclareMathOperator{\GL}{GL}\DeclareMathOperator{\inv}{inv} G=\GL_n(\mathbb{C})$, the Borel $B$ is the subgroup of invertible upper triangular matrices, and $U\subset G$ is the unipotent subvariety. In this setting, the flag variety is isomorphic to $G/B$ or $\mathcal{B}$ where $\mathcal{B}$ is the set of all subgroups conjugate to $B$.

The Hall-Littlewood polynomials

For a given partition $\mu$, the Springer fiber $\mathcal{B}_\mu$ can be thought of as the set of all flags $F$ which are fixed by left multiplication by a unipotent element $u$ of Jordan type $\mu$. In other words, it is the set of complete flags $$F:0=F_0\subset F_1 \subset F_2 \subset \cdots \subset F_n=\mathbb{C}^n$$ where $\dim F_i=i$ and $uF_i=F_i$ for all $i$.

In the last post we saw that there is an action of the Weyl group, in this case the symmetric group $S_n$, on the cohomology rings $H^\ast(\mathcal{B}_\mu)$ of the Springer fibers. We let $R_\mu=H^\ast(\mathcal{B}_\mu)$ denote this ring, and we note that its graded Frobenius characteristic $$\DeclareMathOperator{\Frob}{Frob}\widetilde{H}_\mu(X;t):=\Frob_t(H^\ast(\mathcal{B}_\mu))=\sum_{d\ge 0}t^d \Frob(H^{2d}(\mathcal{B}_\mu))$$ encodes all of the data determining this ring as a graded $S_n$-module. The symmetric functions $\widetilde{H}_\mu(X,t)\in \Lambda_{\mathbb{Q}(t)}(x_1,x_2,\ldots)$ are called the Hall-Littlewood polynomials.

The first thing we might ask about a Hall-Littlewood polynomial $H_\mu$ is: what is its degree as a polynomial in $t$? In other words…

What is the dimension of $\mathcal{B}_\mu$?

The dimension of $\mathcal{B}_\mu$ will tell us the highest possible degree of its cohomology ring, giving us at least an upper bound on the degree of $H_\mu$. To compute the dimension, we will decompose $\mathcal{B}_\mu$ into a disjoint union of subvarieties whose dimensions are easier to compute.

Let’s start with a simple example. If $\mu=(1,1,1,\ldots,1)$ is a single-column shape of size $n$, then $\mathcal{B}_\mu$ is the full flag variety $\mathcal{B}$, since here the unipotent element $1$ is in the conjugacy class of shape $\mu$, and we can interpret $\mathcal{B}_\mu$ as the set of flags fixed by the identity matrix (all flags). As described in Part I, the flag variety can be decomposed into Schubert cells $X_w$ where $w$ ranges over all permutations in $S_n$ and $\dim(X_w)=\inv(w)$. For instance, $X_{45132}$ is the set of flags defined by the initial row spans of a matrix of the form:
0 & 1 & \ast & \ast & \ast \\
1 & 0 & \ast & \ast & \ast \\
0 & 0 & 0 & 0 & 1 \\
0 & 0 & 1 & \ast & 0 \\
0 & 0 & 0 & 1 & 0 \end{array}\right)$$
because this matrix has its leftmost $1$’s in positions $4,5,1,3,2$ from the right in that order.

Thus the dimension of the flag variety is the maximum of the dimensions of these cells. The number of inversions in $w$ is maximized when $w=w_0=n(n-1)\cdots 2 1$, and so $$\dim(\mathcal{B})=\inv(w_0)=\binom{n}{2}.$$

We claim that in general, $\dim(\mathcal{B}_\mu)=n(\mu)$ where if $\mu^\ast$ denotes the conjugate partition, $n(\mu)=\sum \binom{\mu^\ast_i}{2}$. Another way of defining $n(\mu)$ is as the sum of the entries of the superstandard tableau formed by filling the bottom row of $\mu$ with $0$’s, the next row with $1$’s, and so on:


To show this, notice that since $\mathcal{B}_\mu$ is a subvariety of the full flag variety $\mathcal{B}$, and so $$\mathcal{B}_\mu=\mathcal{B}_\mu\cap \mathcal{B}=\bigsqcup \mathcal{B}_\mu\cap X_{w}.$$ It thus suffices to find the largest possible dimension of the varieties $\mathcal{B}_\mu\cap X_{w}$.

Let $u$ be the standard unipotent element of Jordan type $\mu$. For instance, the matrix below is the standard unipotent matrix of shape $(3,2,2)$.
Then the set $\mathcal{B}_\mu\cap X_{w}$ can be defined as the subset of $n\times n$ matrices defining flags in $X_w$ whose partial row spans are fixed by the action of $u$. Note that since the first vector is fixed by $u$, it must be equal to a linear combination of the unit vectors $e_{\mu_1}, e_{\mu_1+\mu_2},\ldots$. So we instantly see that the dimension of $\mathcal{B}_\mu\cap X_w$ is in general less than that of $X_w$.

Now, consider the permutation $$\hat{w}=n,n-\mu_1,n-\mu_1-\mu_2,\ldots,n-1,n-1-\mu_1,n-1-\mu_1-\mu_2,\ldots,\ldots.$$ Then it is not hard to see that the matrices in $X_{\hat{w}}$ whose flags are fixed by $u$ are those with $1$’s in positions according to $\hat{w}$, and with $0$’s in all other positions besides those in row $i$ from the top and column $k-\mu_1-\cdots-\mu_j$ from the right for some $i,j,k$ satisfying $i\le j$ and $\mu^\ast_1+\cdots+\mu^\ast_k< i \le \mu^\ast_1+\cdots+\mu^\ast_{k+1}$. This is a mouthful which is probably better expressed via an example. If $\mu=(3,2,2)$ as above, then $\hat{w}=7426315$, and $\mathcal{B}_\mu\cap X_{7426315}$ is the set of flags determined by the rows of matrices of the form $$\left(\begin{array}{ccccccc} 1 & 0 & 0 & \ast & 0 & \ast & 0 \\ 0 & 0 & 0 & 1 & 0 & \ast & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 & \ast & 0 & \ast \\ 0 & 0 & 0 & 0 & 1 & 0 & \ast \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ \end{array}\right)$$ The first three rows above correspond to the first column of $\mu$, the next three rows to the second column, and the final row to the last column. Notice that the stars in each such block of rows form a triangular pattern similar to that for $X_{w_0}$, and therefore there are $n(\mu)=\binom{\mu^\ast_1}{2}+\binom{\mu^\ast_2}{2}+\cdots$ stars in the diagram. Thus $\mathcal{B}_\mu\cap X_{\hat{w}}$ an open affine set of dimension $n(\mu)$, and so $\mathcal{B}_\mu$ has dimension at least $n(\mu)$. A bit more fiddling with linear algebra and multiplication by $u$ (try it!) shows that, in fact, for any permutation $w$, the row with a $1$ in the $i$th position contributes at most as many stars in $\mathcal{B}_\mu\cap X_{w}$ as it does in $\mathcal{B}_\mu\cap X_{\hat{w}}$. In other words, all other components $\mathcal{B}_\mu\cap X_{w}$ have dimension at most $n(\mu)$, and so $$\dim\mathcal{B}_\mu=n(\mu).$$

The orthogonality relations

In Lusztig’s survey on character sheaves, he shows that the Hall-Littlewood polynomials (and similar functions for other Lie types) satisfy certain orthogonality and triangularity conditions that determine them completely. To state them in the type A case, we first define $\widetilde{H}_\mu[(1-t)X;t]$ to be the result of plugging in the monomials $x_1,-tx_1,x_2,-tx_2,\ldots$ for $x_1,x_2,\ldots$ in the Hall-Littlewood polynomials. (This is a special kind of plethystic substitution.) Then Lusztig’s work shows that:

  • $\left\langle \widetilde{H}_\mu(X;t),s_\lambda\right\rangle=0$ for any $\lambda<\mu$ in dominance order, and $\langle\widetilde{H}_\mu,s_\mu\rangle=1$
  • $\left\langle \widetilde{H}_\mu[(1-t)X;t],s_\lambda\right\rangle=0$ for any $\lambda>\mu$ in dominance order
  • $\left\langle \widetilde{H}_\mu(X;t),\widetilde{H}_{\lambda}[(1-t)X;t]\right\rangle=0$ whenever $\lambda\neq \mu$.

In all three of the above, the inner product $\langle,\rangle$ is the Hall inner product, which can be defined as the unique inner product for which $$\langle s_\lambda,s_\mu\rangle = \delta_{\lambda\mu}$$ for all $\lambda$ and $\mu$.

Since the Schur functions $s_\lambda$ correspond to the irreducible representations $V_\lambda$ of $S_n$, we can therefore interpret these orthogonality conditions in a representation theoretic manner. The inner product $\left \langle \widetilde{H}_\mu(X;t),s_\lambda \right\rangle$ is the coefficient of $s_\lambda$ in the Schur expansion of $\widetilde{H}_\mu$, and is therefore the Hilbert series of the isotypic component of type $V_\lambda$ in the cohomology ring $R_\mu=H^\ast(\mathcal{B}_\mu)$. Moreover, the seemingly arbitrary substitution $X\mapsto (1-t)X$ actually corresponds to taking tensor products with the exterior powers of the permutation representation $V$ of $S_n$. To be precise: $$\widetilde{H}_\mu[(1-t)X;t]=\sum_{i\ge 0} (-1)^i t^i \Frob_t(R_\mu\otimes \Lambda^i(V)).$$

It turns out that any two of the three conditions above uniquely determine the Hall-Littlewood polynomials, and in fact can be used to calculate them explicitly. On the next page, we will work out an example using the first and third conditions above.