Counting ballots with crystals

In my graduate Advanced Combinatorics class last semester, I covered the combinatorics of crystal base theory. One of the concepts that came up in this context was ballot sequences, which are motivated by the following elementary problem about voting:

Suppose two candidates, A and B, are running for local office. There are 100 voters in the town, 50 of whom plan to vote for candidate A and 50 of whom plan to vote for candidate B. The 100 voters line up in a random order at the voting booth and cast their ballots one at a time, and the votes are counted real-time as they come in with the tally displayed for all to see. What is the probability that B is never ahead of A in the tally?

We’ll provide a solution to this classical problem on page 2 of this post. For now, this motivates the notion of a ballot sequence in two letters, which is a sequence of A’s and B’s such that, as the word is read from left to right, the number of A’s that have been read so far is always at least as large as the number of B’s.

For instance, the sequence AABABB is ballot, because as we read from left to right we get the words A, AA, AAB, AABA, AABAB, and AABABB, each of which has at least as many A’s as B’s. On the other hand, the sequence ABBAAB is not, because after reading the first three letters ABB, there are more B’s than A’s.

If we replace the A’s by $1$’s and $B$’s by $2$’s and reverse the words, we obtain the notion of a ballot sequence in $1$’s and $2$’s described in our previous post on crystals. In particular, we say a sequence of $1$’s and $2$’s is ballot if, when we read the word from right to left, there are at least as many $1$’s as $2$’s at each step. So $221211$ and $211111$ are both ballot, but $111112$ and $211221$ are not.

Enumerating all ballot sequences

When I introduced this notion in class, one of my students asked the following.

How many total ballot sequences of $1$’s and $2$’s are there of length $n$?

Now, as in the first question about voting above, the more common version of this type of question is to fix the number of $1$’s and $2$’s in the sequence (the “content” of the word) and ask how many ballot sequences have exactly that many $1$’s and $2$’s. But in this case, the question was asked with no fixed content, resulting in a sum of Littlewood-Richardson coefficients (or, in voting terms, where the voters have not yet decided who they will vote for when they line up, and may vote for either candidate).

To start, let’s try some examples. For $n=0$, there is only one ballot sequence, namely the empty sequence. For $n=1$, there is also just one: $1$. For $n=2$, there are two: $11$ and $21$. For $n=3$, there are three: $111$, $121$, $211$. For $n=4$, there are six: $1111$, $2111$, $1211$, $1121$, $2211$, $2121$. And for $n=5$, there are ten: $$11111, 21111, 12111, 11211, 11121, 22111, 21211, 21121, 12211, 12121$$

The sequence of answers, $1,1,2,3,6,10,\ldots$, so far agrees with the “middle elements” of the rows of Pascal’s triangle:

& & & & & \color{red}1 & & & & & \\
&&&&\color{red} 1&&1 &&&& \\
&&&1&&\color{red} 2&&1&&& \\
&&1&&\color{red} 3&&3&&1&& \\
&1&&4&&\color{red} 6&&4&&1& \\

More formally, it appears that the number of ballot sequences of $1$’s and $2$’s of length $2n$ is $\binom{2n}{n}$, and the number of length $2n+1$ is $\binom{2n+1}{n}$.

Now, it is possible to prove this formula holds using a somewhat complicated recursive argument, which we will also illustrate on page 2 of this post. But there is also very elegant solution using crystal operators.

Solution using crystals

Let’s recall the definition of the crystal operator $F_1$ on words of $1$’s and $2$’s. Given such a word, we first replace all $2$’s with left parentheses, “$($”, and all $1$’s with right parentheses, “$)$”. We then “cancel” left and right parentheses in matching pairs as shown in the following example.

2 & 2 & 1 & 1 & 1 & 1 & 2 & 1 & 2 & 2 & 1 \\
( & ( & ) & ) & ) & ) & ( & ) & ( & ( & ) \\
( & & & ) & ) & ) & & & ( & & \\
& & & & ) & ) & & & ( & &

Once all matching pairs have been cancelled, we are left with a subsequence of the form $$)))\cdots))(((\cdots(($$ consisting of some number of right parentheses (possibly zero) followed by some number of left parentheses (possibly zero). If there is a $)$ remaining, then $F_1$ changes the rightmost $)$ that was not cancelled to $($, changing that $1$ to $2$ in the original word. The word therefore becomes:

2 & 2 & 1 & 1 & 1 & 2 & 2 & 1 & 2 & 2 & 1.

Thus $F_1(22111121221)=22111221221$. If there were no $)$ symbols remaining after cancelling, the operator $F_1$ is undefined.

Now, consider the directed graph on all words of $1$’s and $2$’s of length $n$, where we draw an arrow from word $w$ to word $v$ if $F_1(w)=v$. Here is the graph for $n=4$:

This graph will in general be a union of disjoint one-directional chains, since when $F_1$ is defined it is invertible: the unbracketed $1$ that is changed to a $2$ is still unbracketed, and we can identify it as the leftmost unbracketed $2$ in the new word. We write $E_1$ to denote this inverse operator, which changes the leftmost unpaired $2$ to a $1$ if it exists, and is undefined otherwise.

We also cannot have cycles in the $F_1$ graph, because the number of $1$’s always decreases with every application of $F_1$. Thus we have chains of arrows going forward until we reach an element $w$ for which $F_1(w)$ is undefined. Similarly, going backwards along the $F_1$ arrows, we can continue until $E_1$ is undefined, and we call these top elements of each chain the highest weight words.

There are six highest weight words in the above diagram: $1111$, $2111$, $1211$, $1121$, $2211$, $2121$. Notice that these are precisely the two-letter ballot sequences of length $6$!

Indeed, if a word is ballot, then every $2$ as a left parentheses will be cancelled with some $1$ as a right parentheses to its right, so $E_1$ is undefined on such a word. Conversely, if a word is not ballot, consider the first step in the right-to-left reading of the word that has more $2$’s than $1$’s. The $2$ that is encountered at that step cannot be bracketed with a $1$ to its right, because there are not enough $1$’s to bracket with the $2$’s in that suffix. Thus a word is ballot if and only if $E_1$ is undefined, which means that it is at the top of its chain, or highest weight.

Since there is exactly one highest weight word per chain, we have the following.

The number of ballot sequences of length $n$ is equal to the number of chains in the $F_1$ crystal graph on all $2^n$ words of $1$’s and $2$’s of length $n$.

So, to count the ballot words, it suffices to count the chains of the $F_1$ graph. And here’s the key idea: instead of counting the top elements, count the middle ones!

In the picture above, the middle elements of each chain are: $$1122, 2112, 1212, 1221, 2211, 2121$$ which is just the set of all words having exactly two $1$’s and two $2$’s, and is clearly counted by $\binom{4}{2}$. Why does this work in general?

Here’s where we need one more fact about the $F_1$ chains: they are “content-symmetric”. If the top element of a chain has $k$ ones and $n-k$ twos, then the bottom element has $n-k$ ones and $k$ twos. This is because the top element, after pairing off an equal number of $2$’s and $1$’s by matching parentheses, has a certain number of unpaired $1$’s, which then all get changed to $2$’s one step at a time as we move towards the bottom of the chain. In particular, the middle element of each chain has exactly as many $1$’s as $2$’s (or, if $n$ is odd, the two “middle elements” have one more $1$ than $2$ and one less $1$ than $2$ respectively.)

Finally, since the graph is drawn on all $2^n$ possible words, every word having the same number of $1$’s as $2$’s (or off by $1$ in the odd case) occurs in exactly one chain. It follows that there is a bijection between the chains and these words, which are enumerated by $\binom{2n}{n}$ for words of length $2n$, and $\binom{2n+1}{n}$ for words of length $2n+1$.

For the more elementary approach, and the solution to the classical ballot problem, turn to the next page!

A linear algebra-free proof of the Matrix-Tree Theorem

As a new assistant professor at Colorado State University, I had the privilege this fall of teaching Math 501, the introductory graduate level course in combinatorics. We encountered many ‘mathematical gemstones’ in the course, and one of my favorites is the Matrix-Tree theorem, which gives a determinantal formula for the number of spanning trees in a graph. In particular, there is a version for directed graphs that can be stated as follows.

Consider a directed graph $D=(V,E)$, consisting of a finite vertex set $V=\{v_1,\ldots,v_n\}$ and a set of directed edges $E\subseteq V\times V$. An oriented spanning tree of $D$ is a subset $T\subset E$ of the edges, along with a chosen root vertex $v_k$, such that there is a unique path in $T$ from any vertex $v_j\in V$ to the root $v_k$. Such a tree is said to be oriented towards $v_k$, since all the edges are `pointing towards’ the root. The term spanning indicates that $T$ is incident to every vertex in $V$. For example, in the digraph $D$ at left below, an oriented spanning tree rooted at $v_9$ is shown using red edges in the graph at right.

Define $\tau(D,v_k)$ to be the number of oriented spanning trees of $D$ rooted at $v_k$. One can check that, in the above graph, we have $\tau(D,v_9)=16$. Now, let $m_{i,j}$ be the number of directed edges from $v_i$ to $v_j$ in $D$, so that $m_{i,j}$ is equal to $1$ if $(v_i,v_j)$ is an edge and $0$ otherwise. Define the Laplacian of the digraph $D$ to be the matrix $$L(D)=\left(\begin{array}{ccccc} \mathrm{out}(v_1) & -m_{1,2} & -m_{1,3} & \cdots & -m_{1,n} \\ -m_{2,1} & \mathrm{out}(v_2) & -m_{2,3} & \cdots & -m_{2,n} \\ -m_{3,1} & -m_{3,2} & \mathrm{out}(v_3) & \cdots & -m_{3,n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -m_{n,1} & m_{n,2} & -m_{n,3} & \cdots & \mathrm{out}(v_n) \end{array}\right)$$ where $\mathrm{out}(v_i)$ is the outdegree of $v_i$, the number of non-loop edges having starting vertex $v_i$ (that is, the number of edges from $v_i$ to a vertex other than $v_i$). Then the (directed) Matrix-Tree theorem states that $$\tau(D,v_k)=\det(L_0(D,k))$$ where $L_0(D,k)$ is the deleted Laplacian obtained by deleting the $k$th row and column from $L(D)$. For instance, in the above graph, we have $$\det L_0(D,9)=\det \left(\begin{array}{cccccccc} 2 & -2 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & -1 & -1 & 0 & 0 & 0 & 0 & 0 \\ -1 & 0 & 2 & 0 & 0 & -1 & 0 & 0 \\ 0 & -1 & 0 & 2 & -1 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 & 2 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 & -1 & 2 & 0 & -1 \\ 0 & 0 & 0 & 0 & -1 & 0 & 2 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{array}\right)=16$$ There are several known proofs of the Matrix-Tree theorem. One of the more `standard’ proofs is by induction on the number of edges in the digraph, combined with a bit of linear algebra and row reduction. But it got me thinking:

Is there be a way to prove that the determinant formula holds directly, without relying on induction or linear algebra?

In particular, the determinant of a matrix $A=(a_{ij})$ can be defined explicitly as $$\det(A)=\sum_{\pi\in S_n} \mathrm{sgn}(\pi)\prod_{i} a_{i\pi(i)}$$ where $\pi:\{1,2,\ldots,n\}\to \{1,2,\ldots,n\}$ ranges over all permutations (bijections) in the symmetric group $S_n$. For instance, \begin{align*} \det\left(\begin{array}{ccc} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{array}\right)&=a_{11}a_{22}a_{33}-a_{12}a_{21}a_{33}-a_{11}a_{23}a_{32} \\ &\phantom{=}+a_{12}a_{23}a_{31}+a_{13}a_{21}a_{32}-a_{13}a_{22}a_{31}. \end{align*} It is natural to ask whether applying this formula to the deleted Laplacian gives any combinatorial insight into why the Matrix-Tree theorem should hold. And indeed, there is a direct proof using this combinatorial definition of the determinant!

A combinatorial proof

For simplicity we set $k=n$, so that we are deleting the $n$th row and column to create the deleted Laplacian $\det(L_0(D,n))$. It is sufficient to consider this case since we can always relabel the vertices to have the deleted vertex be the $n$th.

We now give a combinatorial interpretation of each of the terms of the determinant $\det(L_0(D,n)$ as a sum over permutations of $\{1,2,\ldots,n-1\}$. The term corresponding to the identity permutation is the product of the diagonal entries of $L_0(D,n)$, which is $$\prod_{i\neq n} \mathrm{out}(v_i).$$ This counts the number of ways of choosing a non-loop edge starting at each vertex $v_i\neq v_n$; we call such a choice an out-edge subgraph $G$ of $D$. Note that all oriented spanning trees with root $v_n$ are out-edge subgraphs, but in general an out-edge subgraph may have cycles among the vertices other than $v_n$. In fact, it is not hard to see that every out-edge subgraph consists a number of nontrivial directed cycles among non-$v_n$ vertices, along with a unique directed path from every other vertex into either one of the cycles or into $v_n$. Two examples of out-edge subgraphs which are not trees are shown below.

Now, for a general term corresponding to a permutation $\pi$ of $\{1,2,\ldots,n-1\}$, consider the decomposition of $\pi$ into disjoint cycles. Suppose there are $p$ fixed points and $r$ nontrivial cycles; let $a_1,\ldots,a_p$ be the fixed points of $\pi$ and $(a_{1}^{(j)}\cdots a_{c_j}^{(j)})$ are the other cycles of lengths $c_1,\ldots,c_r$. Then the sign of $\pi$ is $$\mathrm{sgn}(\pi)=(-1)^{(c_1-1)+\cdots+(c_r-1)}=(-1)^{(n-1-p)-r}.$$ The entries multiplied together in the term corresponding to $\pi$ are the outdegrees of $v_{a_1},\ldots, v_{a_p}$ along with the values $-m_{a_{t}^{(i)},a_{t+1}^{(i)}}$. Their product is $(-1)^{n-1-p}$ times the number of ways to choose an edge from $v_{a_t^{(i)}}$ to $v_{a_{t+1}^{(i)}}$ for each $i$ and $t$. Putting this all together, the entire term of the determinant corresponding to $\pi$ is $(-1)^{r}$ times the number of subgraphs formed by choosing a cyclic path on the vertices corresponding to each nontrivial cycle in $\pi$, as well as an out edge for each fixed point. Such a choice is an out-edge subgraph that is compatible with $\pi$ in the sense that any cycle of $\pi$ corresponds to a cycle on the subgraph.

For some examples of compatibility, the permutations $(123)$, $(123)(57)$, $(57)$, and the identity are compatible with the out-edge subgraph drawn above at left. The permutations $(365)$ and the identity are compatible with the subgraph above at right.

It follows that we can rewrite the determinant as: $$\det L_0(D,n)=\sum_{(G,\pi)} (-1)^{r(\pi)}$$ where $r(\pi)$ is the number of nontrivial cycles of the $\pi$, and where the sum ranges over all pairs $(G,\pi)$ where $G$ is an out-edge subgraph and $\pi$ is a permutation compatible with $G$. (Note that the same out-edge subgraph $G$ may occur several times, paired with different permutations $\pi$.)

We finally construct a sign-reversing involution on the compatible pairs $(G,\pi)$ that cancel all the negative entries in the sum above. In particular, if $G$ has no cycles then send $(G,\pi)$ to itself, and otherwise consider the cycle $C$ in $G$ containing the vertex with the smallest label among all cycles in $G$. Define $\pi’$ by removing $C$ from $\pi$ if $\pi$ contains the cycle $C$, and otherwise adding $C$ to $\pi$ (in other words, toggle whether the elements of $C$ form a cycle or are all fixed points in the permutation). Then $\pi’$ is still compatible with $G$, so we can map $(G,\pi)$ to $(G,\pi’)$ in this case. This forms a sign-reversing involution in which the only non-canceling terms come from the pairs $$(T,\mathrm{id})$$ where $T$ is an out-edge subgraph with no cycles and $\mathrm{id}$ is the identity permutation.

Since a non-cyclic out-edge subgraph on $v_1,\ldots,v_{n-1}$ must be rooted at $v_n$ (for otherwise it would have a cycle), we can conclude that $\det L_0(D,n)$ is the number of spanning trees of $D$ rooted at $v_n$.

On Raising Your Hand

A few weeks ago I attended the AWM (Association of Women in Mathematics) Research Symposium in Houston, TX. I gave a talk in my special session, speaking on queer supercrystals for the first time, to a room full of female mathematicians.

I was a bit disappointed when, at the end of my talk, no one raised their hand to ask any questions.  It’s usually the classic sign of an uninteresting or inappropriately aimed talk, so I figured that maybe I had to revisit my slides and make them more accessible for the next time I spoke on the subject.

Afterwards, however, several of the women in my session came up to me privately to ask specific questions about my research.  When I told my husband about this after the conference, he pointed out that perhaps they just were the kind of people to prefer asking questions one-on-one rather than raising their hands during or after the lecture.

“Did anyone in your session ask questions after the other talks?” he asked me, testing his theory.

I thought about it, and was surprised when I realized the answer.  “Woah, I think you’re right,” I said.  “I asked at least one question after nearly every talk.  But I think I was the only one.  Once in a while one other woman would ask something too.  But the rest kept their hands down and went up to the speaker during the break to ask their questions.”

Upon further reflection, I realized that this was even true during the plenary talks.  During an absolutely fantastic lecture by Chelsea Walton, I was intrigued by something she said.  She mentioned that the automorphism group of the noncommutative ring $$\mathbb{C}\langle x,y\rangle/(xy-qyx)$$ is $\mathbb{C}^{\times} \times \mathbb{C}^{\times}$ for all $q\neq \pm 1$, but the answer is different at $q=1$ and $q=-1$.  I knew that many of the standard $q$-analogs arise naturally in computations in this particular ring, such as the $q$-numbers $$[n]_q=1+q+q^2+\cdots +q^{n-1}.$$ So, I wondered if the exceptions at $q=1$ and $q=-1$ were happening because $q$ was a root of unity, making some of the $q$-numbers be zero.  So maybe she was considering $q$ as a real parameter?  I raised my hand to ask.

“Is $q$ real or complex in this setting?”

“It’s complex,” Chelsea answered.  “Any nonzero complex parameter $q$.”

“Really?” I asked. “And there are no exceptions at other roots of unity?”

“Nope!” she replied with a smile, getting excited now.  “Just at $1$ and $-1$.  The roots of unity get in your way when looking at the representation theory.  But for the automorphism group, there are only two exceptional values for $q$.”  Fascinating!

No one else asked any mathematical questions during or after that talk.

Now, I have the utmost faith in womankind.  And I would normally have chalked the lack of questions and outspokenness up to it being a less mathematically cohesive conference than most, because the participants were selected from only a small percentage of mathematicians (those that happened to be female).  But it reminded me of another time, several years ago, that I had been surprised to discover the same phenomenon among a group of women in mathematics.

One summer I was visiting the Duluth REU, a fantastic research program for undergraduates run by Joe Gallian in the beautiful and remote city of Duluth, Minnesota.  As a former student at the program myself, I visited for a couple of weeks to hang out and talk math with the students.  I attended all the weekly student talks, and as usual, participated heavily, raising my hand to ask questions and give suggestions.

The day before I left, Joe took me aside.  “I wanted to thank you for visiting,” he said.  “Before you came, the women never raised their hand during the other students’ talks.  But after they saw you doing it, suddenly all of them are participating and raising their hands!”

I was floored.  I didn’t know that being a woman had anything to do with asking questions.

I have always felt a little out of place at AWM meetings.  They are inevitably host to many conversations about the struggles faced by women in competitive male-dominant settings, which I have never really related to on a personal level.  I love the hyper-competitive setting of academia.  I live for competition; I thrive in it.  And it never occurs to me to hold back from raising my hand, especially when I’m genuinely curious about why $q$ can be a complex root of unity without breaking the computation.

But, clearly, many women are in the habit of holding back, staying in the shadows, asking their questions in a one-on-one setting and not drawing attention to themselves.  And I wonder how much this phenomenon plays a role in the gender imbalance and bias in mathematics.

At the reception before the dinner at the AWM conference, I spotted Chelsea.  She was, unsurprisingly, quite popular, constantly engaged in conversation with several people at once.  I eventually made my way into a conversation in a group setting with her in it, and I introduced myself.

“Hi, I just wanted to say I really enjoyed your talk!  I was the one asking you whether $q$ was real.”

Her expression suddenly shifted from ‘oh-no-not-another-random-person-I-have-to-meet’ to a warm, smiling face of recognition.  “Oh!  I liked your question!” she exclaimed.  The conversation immediately turned to math, and she was nice enough to walk me through enough computations to convince me that $q=\pm 1$ were special cases in computing the automorphism group of the noncommutative ring.  (See Page 2 of this post for the full computation!)

The entire experience got me thinking.  It was because I raised my hand that Chelsea recognized me, that she was happy to talk to me and mathematics was communicated.  It was because I raised my hand that I got the question out in the open so that other participants could think about it as well.  It was because I raised my hand that women were doing mathematics together.  And perhaps it is because I raise my hand that I have no problem interacting in a male-dominant environment.  After all, they raise their hands all the time.

It is tempting to want to ask the men in mathematics to take a step back and let the women have the limelight once in a while.  But I don’t think that’s the answer in this case.  Men should keep raising their hands.  It’s part of how mathematics gets done.  It helps to communicate ideas more efficiently, to the whole room at once rather than only in private one-on-one settings.  It draws visibility to the interesting aspects of a talk that other participants may not have thought of.

What we really need is for women to come out of the shadows.  So, to my fellow women in mathematics: I’m calling on all of us to ask all our questions, to engage with the seminar room, to not hold back in those immensely valuable times when we are confused.  And raise our hands!

Schubert Calculus mini-course

I’ve written a lot about Schubert calculus here over the last few years, in posts such as Schubert Calculus, What do Schubert Curves, Young tableaux, and K-theory have in common? (Part II) and (Part III), and Shifted partitions and the Orthogonal Grassmannian.

I soon found out that writing a lot about your favorite topics on a blog can sometimes get you invited to give talks on said topics. This past June, I gave one of the three graduate mini-courses at the Equivariant Combinatorics workshop at the Center for Mathematics Research (CRM) in Montreal.

Naturally, a lot of what I covered came from the blog posts I already had written, but I also prepared more material on flag varieties and Schubert polynomials, more details on the cohomology of the Grassmannian, and more problems and examples suitable for the workshop. So I organized all of this material into a single lecture notes document available here:

Variations on a Theme of Schubert Calculus

The coolest part was that the CRM had good quality audio/video setup for all three workshops, and so you can also view my lecture videos that accompany these notes at the following five links, and take the entire course yourself:

Lecture 1: Introduction, Projective varieties, Schubert cells in the Grassmannian

Lecture 2: Duality theorem, CW complexes, Homology/cohomology

Lecture 3: Littlewood-Richardson rule, Flag variety, Schubert polynomials

Lecture 4: Cohomology of flag variety, generalized flag varieties

Lecture 5: The orthogonal Grassmannian, recap

You can also find the videos for the other two mini-courses from the summer – Steven Griffeth’s lectures on Cherednik algebras and Jeffrey Remmel’s lectures on symmetric function theory – at this link.

Hope someone out there finds these lectures useful or interesting. I certainly had a blast teaching the course!

Writing polynomials in terms of invariants and coinvariants

There is a fun little fact regarding polynomials in two variables $x$ and $y$:

Any two-variable polynomial $f(x,y)$ can be uniquely written as a sum of a symmetric polynomial and an antisymmetric polynomial.

(To be more precise, this is true for polynomials over any field of characteristic not equal to $2$. For simplicity, in what follows we will assume that our polynomials have coefficients in $\mathbb{C}$.)

Recall that a polynomial $g$ is symmetric if it does not change upon permuting its variables. In this case, with two variables, $g(x,y)=g(y,x)$. It is antisymmetric if swapping any two of the variables negates it, in this case $g(x,y)=-g(y,x)$.

It is not hard to prove the fact above. To show existence of the decomposition, set $g(x,y)=\frac{f(x,y)+f(y,x)}{2}$ and $h(x,y)=\frac{f(x,y)-f(y,x)}{2}$. Then $$f(x,y)=g(x,y)+h(x,y),$$ and $g$ is symmetric while $h$ is antisymmetric. For instance, if $f(x,y)=x^2$, then we can write $$x^2=\frac{x^2+y^2}{2}+\frac{x^2-y^2}{2}.$$

For uniqueness, suppose $f(x,y)=g_0(x,y)+h_0(x,y)$ where $g_0$ is symmetric and $h_0$ is antisymmetric. Then $g_0+h_0=g+h$, and so $$g_0-g=h-h_0.$$ The left hand side of this equation is symmetric and the right hand side is antisymmetric, and so both sides must be identically zero. This implies that $g_0=g$ and $h_0=h$, so the unique decomposition is $f=g+h$. QED.

This got me thinking…

Is there an analogous decomposition for polynomials in three variables? Or any number of variables?

The above decomposition doesn’t make sense in three variables, but perhaps every polynomial in $x$, $y$, and $z$ can be written uniquely as a sum of a symmetric, antisymmetric, and… some other particular type(s) of polynomials.

Indeed, it can be generalized in the following sense. Notice that any antisymmetric polynomial in two variables is divisible by $x-y$, since setting $x=y$ gives us $f(x,x)=-f(x,x)=0$. Moreover, dividing by $x-y$ gives a symmetric polynomial: if $$h(x,y)=p(x,y)\cdot(x-y)$$ is antisymmetric, then $p(x,y)\cdot (x-y)=-p(y,x)\cdot(y-x)$, and so $p(x,y)=p(y,x)$.

Thus any antisymmetric polynomial $h$ is equal to $x-y$ times a symmetric polynomial, and so we can restate our fact above in the following way:

Any two variable polynomial $f(x,y)$ can be written uniquely as a linear combination of $1$ and $x-y$, using symmetric polynomials as the coefficients.

For instance, $f(x,y)=x^2$ can be written as $$x^2=\left(\frac{x^2+y^2}{2}\right)\cdot 1+\left(\frac{x+y}{2}\right)\cdot (x-y).$$

Now, to generalize this to three variables, in place of $x-y$, we consider the polynomial $$\Delta=(x-y)(x-z)(y-z).$$ Also consider the five polynomials:

  • $x^2-z^2+2yz-2xy$,
  • $z^2-y^2+2xy-2xz$,
  • $x-y$,
  • $y-z$, and
  • $1$,

each of which is obtained by taking certain partial derivatives starting with $\Delta$. It turns out that every polynomial in three variables can be decomposed uniquely as a linear combination of these six polynomials, using symmetric polynomials as the coefficients!

Where do these six polynomials come from? Turn to the next page to find out…

A proof of the existence of finite fields of every possible order

This is our first contributed gemstone! Submitted by user Anon1.

In the following, $p$ denotes a prime. We wish to prove that, for all positive integers $n$, there is a finite field of order $p^{n}$.

Step 1. Restating the problem.

Claim: It suffices to show that, for some power of $p$ (call it $q$), there exists a finite field of order $q^{n}$.

Proof. Suppose there is a field $F$ such that $|F| = q^{n}$. The claim is that the solution set to $x^{p^{n}} = x$ in $F$ is a subfield of order $p^{n}$.

Since $q$ is a power of $p$, we have $$p^{n}-1 | q^{n}-1.$$ Since $q^{n}-1$ is the order of the cyclic group $F^{ \times }$, we know that $F^{ \times }$ has a unique cyclic subgroup of order $p^{n}-1$. This contributes $p^{n}-1$ solutions to $x^{p^{n}} = x$ in $F$. But $0$ is another solution, since it is not an element of $F^{ \times }$. This gives $p^{n}$ solutions, so this gives all solutions.

It remains to show that these solutions form a subfield of $F$. Since they form a finite set, it suffices to show that they are closed under addition and, since they include 0, it suffices to show they are closed under multiplication.

Closure under multiplication is immediate: $x^{p^{n}} = x$ and $y^{p^{n}} = y$ implies $$(xy)^{p^{n}} = x^{p^{n}}y^{p^{n}} = xy.$$

Closure under addition is similarly easy: Since $|F| = q$ and $q$ is a power of $p$, the field $F$ has characteristic $p$. This means that, for all $x,y \in F$, $$(x+y)^{p} = x^{p} + y^{p}.$$ Applying this $n$ times yields $(x+y)^{p^{n}} = x^{p^{n}} + y^{p^{n}}$. If $x^{p^{n}} = x$ and $y^{p^{n}} = y$, then it follows that $(x+y)^{p^{n}} = x + y$. We are done with Step 1. $\square$

Step 2. Reducing to an asymptotic problem.

Claim: There are arbitrarily large $k$ such that there is a finite field of order $q = p^{k}$.

Proof. If $f \in \mathbb{F}_{p}[x]$ is irreducible and of degree $k$, then $\mathbb{F}_{p}[x]/(f)$ is a field of order $p^{k}$. So we need to prove that the degrees of irreducible polynomials in $\mathbb{F}_{p}[x]$ get arbitrarily high. Since $\mathbb{F}_{p}[x]$ has only $(p-1)p^{k}$ polynomials of degree $k$, it suffices to prove that $\mathbb{F}_{p}[x]$ has infinitely many irreducible polynomials.

At this point we use Euclid’s argument: if $f_{1}, \ldots , f_{N}$ are all the irreducibles in $\mathbb{F}_{p}[x]$, then $1 + f_{1} \cdot \ldots \cdot f_{N}$ cannot be factored into irreducibles (and isn’t constant), contradicting the fact that $\mathbb{F}_{p}[x]$ is a unique factorization domain. Therefore $\mathbb{F}_{p}[x]$ has infinitely many irreducibles and we are done. $\square$

Step 3. Counting the irreducible polynomials of degree $d$ over various $\mathbb{F}_q$.

Claim: If $q$ is a power of $p$ for which there is a finite field of order $q$ (here denoted $\mathbb{F}_{q}$), then the number of irreducible monic polynomials of degree $d$ over $\mathbb{F}_{q}$ is a polynomial (call it $I_{d}(q)$) depending only on $d$, evaluated at $q$.

Proof. This proceeds by strong induction on $d$. The base case is $d = 1$: any linear polynomial is irreducible, and there are $q$ monic linear polynomials over $\mathbb{F}_{q}$.

Now suppose $d > 1$. There are $q^{d}$ monic degree $d$ polynomials over $\mathbb{F}_{q}$. To count how many are irreducible, we remove the ones which are reducible: If a monic polynomial is reducible, it can be written uniquely as a product of monic irreducibles of lower degree. These degrees add up to $d$, so they give a partition $\lambda$ of $d$ which can be any partition except $(d)$ itself.

Specifically, for each such partition $\lambda$ suppose $a_{i}$ is the number of parts of size $i$. Then
$$I_{d}(q) = q^{d} – \sum_{ \lambda \vdash d, \lambda \neq (d)}{ \prod_{i=1}^{d} \binom{I_{i}(q) + a_{i} – 1}{a_{i}} },$$ which is a polynomial in $q$ by the strong induction hypothesis. $\square$

Now, it suffices to show that for any given $n$, $I_n(q)>0$ for sufficiently large $q$. For, by step 2, this will give us a power of $p$ called $q$ for which $\mathbb{F}_q$ exists and such that there is a monic irreducible polynomial over $\mathbb{F}_q$ of degree $n$. By step 1 we are then done.

Since $I_{n}$ is a polynomial, this amounts to showing that $I_{n}$ has a positive leading coefficient for all $n$. This we do now:

Step 4. Showing that the counting polynomial $I_n$ has positive leading coefficient.

Claim: For all $n$, $I_{n}$ has degree $n$ and leading coefficient $1/n$.

Proof. Again we proceed by strong induction on $n$. When $n = 1$, we have $I_{n}(q) = q$ and we are done. So suppose $n>1$. Then as in step 3, we have
$$I_{n}(q) = q^{n} – \sum_{ \lambda \vdash n, \lambda \neq (n) }{ \Pi_{i=1}^{n} \binom{I_{i}(q) + a_{i} – 1}{a_{i}} }.$$
Since $a_{n} = 0$, we can restrict the products to only $i \lt n$. Then the degree of each of these products is $\sum_{i=1}^{n} i\cdot a_i=n$ by the induction hypothesis, and so $I_n(q)$ has degree at most $n$. Moreover, assuming by the induction hypothesis that the coefficient of $q^i$ in $I_i(q)$ is $1/i$, we have that the coefficient of $q^{i\cdot a_i}$ in $\binom{I_i(q)+a_i-1}{a_i}$ is $$\frac{1}{a_i!\cdot i^{a_i}}.$$

Finally, it follows that the coefficient of $q^n$ in $I_n(q)$ is $$1-\sum_{ \lambda \vdash n, \lambda \neq (n) }{ \Pi_{i=1}^{n}{ \frac{1}{a_{i}!i^{a_{i}}}}}.$$ Now, notice that for a given $\lambda$, the product $$z_\lambda=a_1!\cdot 1^{a_1}\cdot a_2!\cdot 2^{a_2}\cdot\cdots\cdot a_n!\cdot n^{a_n}$$ is the well-known demoninator of the formula $$n!/z_\lambda$$ for the size of the conjugacy class of elements of shape $\lambda$ in the symmetric group $S_n$. Thus we have $$\sum_{\lambda\vdash n} \frac{n!}{z_\lambda}=n!,$$ and so $$\sum_{\lambda\vdash n}\frac{1}{z_\lambda}=1.$$ It follows from this and the formula for our coefficient above that the coefficient is simply $$\sum_{\lambda=(n)}\frac{1}{z_\lambda}=1/n,$$ and the claim is proven. $\square$

Schubert Calculus

I recently gave a talk on Schubert calculus in the Student Algebraic Geometry Seminar at UC Berkeley. As a combinatorialist, this is a branch of enumerative geometry that particularly strikes my fancy. I also made a handout for the talk called “Combinatorics for Algebraic Geometers,” and I thought I’d post it here in blog format.


Enumerative geometry

In the late 1800’s, Hermann Schubert investigated problems in what is now called enumerative geometry, or more specifically, Schubert calculus. As some examples, where all projective spaces are assumed to be over the complex numbers:

  1. How many lines in $\newcommand{\PP}{\mathbb{P}}
    \DeclareMathOperator{\GL}{GL}\PP^n$ pass through two given points?
  2. Answer: One, as long as the points are distinct.

  3. How many planes in $\PP^3$ contain a given line $l$ and a given point $P$?
  4. Answer: One, as long as $P\not\in l$.

  5. How many lines in $\PP^3$ intersect four given lines $l_1,l_2,l_3,l_4$?
  6. Answer: Two, as long as the lines are in sufficiently “general” position.

  7. How many $(r-1)$-dimensional subspaces of $\PP^{m-1}$ intersect each of $r\cdot (m-r)$ general subspaces of dimension $m-r-1$ nontrivially?
  8. Answer: $$\frac{(r(m-r))!\cdot (r-1)!\cdot (r-2)!\cdot \cdots\cdot 1!}{(m-1)!\cdot(m-2)!\cdot\cdots\cdot 1!}$$

The first two questions are not hard, but how would we figure out the other two? And what do we mean by “sufficiently general position”?

Schubert’s 19th century solution to problem 3 above would have invoked what he called the “Principle of Conservation of Number,” as follows. Suppose the four lines were arranged so that $l_1$ and $l_2$ intersect at a point $P$, $l_3$ and $l_4$ intersect at $Q$, and none of the other pairs of lines intersect. Then the planes formed by each pair of crossing lines intersect at another line $\alpha$, which necessarily intersects all four lines. The line $\beta$ through $P$ and $Q$ also intersects all four lines, and it is not hard to see that these are the only two in this case.

Schubert would have said that since there are two solutions in this configuration and it is a finite number of solutions, it is true for every configuration of lines for which the number is finite by continuity. Unfortunately, due to degenerate cases involving counting with multiplicity, this led to many errors in computations in harder questions of enumerative geometry. Hilbert’s 15th problem asked to put Schubert’s enumerative methods on a rigorous foundation. This led to the modern-day theory known as Schubert calculus.

Describing moduli spaces

Schubert calculus can also be used to describe intersection properties in simpler ways. As we will see, it will allow us to easily prove statements such as:

The variety of all lines in $\PP^4$ that are both contained in a general $3$-dimensional hyperplane $S$ and intersect a general line $l$ nontrivially is isomorphic to the variety of all lines in $S$ passing through a specific point in that hyperplane.

(Here, the specific point in the hyperplane is the intersection of $S$ and $L$.)

A better way: Carlitz’s bijection

In my previous post on the major index, I mentioned the statistics $\DeclareMathOperator{\inv}{inv}
\DeclareMathOperator{\code}{code} \inv$ and $\maj$ on permutations, and the fact that they were equidistributed. I have since learned of a more enlightening way of proving this result, due to Carlitz.

Let’s start by recalling the definitions of $\inv$ and $\maj$. An inversion of a permutation of the numbers $1,2,\ldots,n$ is a pair of numbers which appear “out of order” in the sequence, that is, with the larger number appearing first. For instance, for $n=4$, the permutation $3,1,4,2$ has three inversions: $(3,1)$, $(3,2)$, and $(4,2)$. We write $$\inv(3142)=3.$$

For the major index, define a descent of a permutation to be an index $i$ for which the $i$th number is greater than the $(i+1)$st. The major index is defined to be the sum of the descents. For instance, the permutation $3,1,4,2$ has two descents, in positions $1$ and $3$, so the major index is $4$. We write $$\maj(3142)=4.$$

It turns out that
\sum_{w\in S_n} q^{\inv(w)} &=& (1)(1+q)(1+q+q^2)\cdots(1+q+q^2+\cdots+q^{n-1}) \label{eqn} \\
&=& \sum_{w\in S_n}q^{\maj(w)},
\end{eqnarray} so we say that $\inv$ and $\maj$ are equidistributed, that is, the number of permutations having $\maj w=k$ is equal to the number having $\inv w=k$ for all $k$.

Carlitz’s proof, which is explained beautifully on pages 5 and 6 of this paper of Mark Skandera, cleanly shows each equality directly, without first requiring any algebraic manipulation. (Carlitz’s original paper can be found here, and it seems that by creating a MyJSTOR account, you can now access this article online for free! Thumbs up to this recent step towards open access.)

The main idea of Carlitz’s bijection is to consider an intermediate combinatorial object called a permutation code. Given a permutation $w=w_1, w_2,\ldots,w_n$ of $\{1,2,\ldots,n\}$, define $\code(w)=a_1a_2\cdots a_n$ where $a_i$ is the number of entries of $w$ to the right of $w_i$ which are less than $w_i$. For instance, we have $$\code(41532)=30210,$$ since the $4$ is part of an inversion with three entries to its right, the $1$ with none, and so on. It is not hard to see that each permutation has a unique code, and a sequence is the code of some permutation if and only if its last entry is (at most) $0$, its second-to-last entry is at most $1$, its third-to-last entry is at most $2$, and so on.

To formalize this, we define a code of length $n$ to be a sequence of $n$ nonnegative integers such that the $i$th entry is no larger than $n-i$. There are clearly $n!$ codes (one choice for the last digit, two choices for the second-to-last, and so on), and furthermore the $q$-series that counts codes by the sum of their entries is given by
$$\sum_{c\text{ code}\\ |c|=n} q^{\sum c_i}=(1)(1+q)(1+q+q^2)\cdots(1+q+q^2+\cdots+q^{n-1}).$$
The bijection $\code$ described above sends a permutation $w$ with $\inv w=k$ to a code having sum $k$, and so this immediately shows that
\sum_{w\in S_n} q^{\inv w}&=&\sum_{c\text{ code} |c|=n} q^{\sum c_i} \\

So, it now suffices to find a bijection between codes of length $n$ and permutations of $n$, which I’ll call $\phi$, which sends the sum of the code to $\maj$ of the permutation. Given a code of length $n$, we read it backwards, and at each step insert the numbers $n, n-1,\ldots$ successively to form a permutation, so that at each step, the major index of the resulting word increases by precisely the code number read at each step. For example, consider the code $30210$ from above. We read it backwards, so the first entry is $0$, and we first place a $5$ in our permutation. So our first step yields the sequence: $$5$$ The next code entry from the right is $1$, so we must insert the $4$ in a way that increases the major index by exactly one. We see that the only way to do this is to insert it to the right of the $5$: $$5,4$$ The next code entry is $2$, so we insert the $3$ to the right of the $4$, as this is the only way to increase the major index by $2$: $$5,4,3$$ The next entry is $0$, and the only place to insert a $2$ so that the major index does not change is before the $3$. So our sequence is now:
$$5,4,2,3$$ Finally, we insert the $1$ so as to increase the major index by $3$:

So $\phi(30210)=54213.$ It is not hard to show that there is always exactly one valid insertion of the next number at each step in the process, and hence $\phi$ is a bijection, with the property that $\sum(c)=\maj(\phi(c))$ for any code $c$ of length $n$. This implies immediately that
\sum_{w\in S_n} q^{\maj w}&=&\sum_{c\text{ code} |c|=n} q^{\sum c_i} \\
\end{eqnarray*} and the desired equation follows.

Yet another definition of the Schur functions

I continue to be amazed at the multitude of different contexts in which the Schur functions naturally appear.

In a previous post, I defined the Schur symmetric functions combinatorially, via the formula $$s_\lambda=\sum_{|\mu|=|\lambda|}K_{\lambda\mu} m_\mu$$ where $K_{\lambda\mu}$ is the number of semistandard Young tableaux of shape $\lambda$ and content $\mu$ and $m_\mu$ is the monomial symmetric function of shape $\lambda$. I also defined them as the ratio $$s_\lambda=\frac{a_{\lambda+\delta}}{a_\lambda}$$ where $a_\lambda$ is the elementary antisymmetric function.

And, in another post, I pointed out that the Frobenius map sends the irreducible characters of the symmetric group $S_n$ to the Schur functions $s_\lambda$. This can be taken as a definition of the Schur functions:

The Schur functions $s_\lambda$, for $|\lambda|=n$, are the images of the irreducible representations of $S_n$ under the Frobenius map.

Today, I’d like to introduce an equally natural representation-theoretic definition of the Schur functions:

The Schur functions $s_\lambda$, for $\lambda$ having at most $n$ parts, are the characters of the polynomial representations of the general linear group $GL_n(\mathbb{C})$.

I recently read about this in Fulton’s book on Young Tableaux, while preparing to give a talk on symmetric function theory in this term’s new seminar on Macdonald polynomials. Here is a summary of the basic ideas (turn to page 2):

Combinatorial species

Just last month, I took my Ph.D. qualifying exam at Berkeley. It was a grueling three-hour oral exam, but from it came a gemstone that simply must be shared!

The very first question I was asked was from my advisor, Mark Haiman: “Enumerate [count] the labeled plane trees on $n$ vertices.”

More precisely, let $n$ be a positive integer, and define a labeled plane tree to be a tree with vertices labeled $1,2,\ldots,n$, along with a cyclic ordering of the branches attached to each vertex. (The name “plane tree” refers to the fact that the orderings of the branches determine an embedding of the tree into Euclidean space up to homotopy equivalence.) For instance, say we had the following trees drawn in the plane:


They are isomorphic as graphs but not as plane trees, because the branches around the vertex $1$ have different clockwise cyclic orderings.

So how do we count these trees?

Let’s get our hands dirty and start counting them by brute force. Letting $p_n$ denote the number of distinct plane trees up to isomorphism, it isn’t hard to find that $p_0=0$, $p_1=1$, $p_2=1$, $p_3=3$, $p_4=20$, and with a bit more work, $p_5=210$. Hmm. No obvious pattern jumps out immediately, and the computation gets difficult very quickly as $n$ increases.

Perhaps we can find a generating function for $p_n$ instead. And here is where species play a role.