
Writing polynomials in terms of invariants and coinvariants

There is a fun little fact regarding polynomials in two variables $x$ and $y$:

Any two-variable polynomial $f(x,y)$ can be uniquely written as a sum of a symmetric polynomial and an antisymmetric polynomial.

(To be more precise, this is true for polynomials over any field of characteristic not equal to $2$. For simplicity, in what follows we will assume that our polynomials have coefficients in $\mathbb{C}$.)

Recall that a polynomial $g$ is symmetric if it does not change upon permuting its variables. In this case, with two variables, $g(x,y)=g(y,x)$. It is antisymmetric if swapping any two of the variables negates it, in this case $g(x,y)=-g(y,x)$.

It is not hard to prove the fact above. To show existence of the decomposition, set $g(x,y)=\frac{f(x,y)+f(y,x)}{2}$ and $h(x,y)=\frac{f(x,y)-f(y,x)}{2}$. Then \[f(x,y)=g(x,y)+h(x,y),\] and $g$ is symmetric while $h$ is antisymmetric. For instance, if $f(x,y)=x^2$, then we can write \[x^2=\frac{x^2+y^2}{2}+\frac{x^2-y^2}{2}.\]

For uniqueness, suppose $f(x,y)=g_0(x,y)+h_0(x,y)$ where $g_0$ is symmetric and $h_0$ is antisymmetric. Then $g_0+h_0=g+h$, and so \[g_0-g=h-h_0.\] The left hand side of this equation is symmetric and the right hand side is antisymmetric, so each side is both symmetric and antisymmetric; such a polynomial equals its own negation and is therefore identically zero (this is where we use that the characteristic is not $2$). This implies that $g_0=g$ and $h_0=h$, so the unique decomposition is $f=g+h$. QED.
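Both halves of this argument are easy to experiment with by computer. Here is a minimal sketch, assuming Python with the SymPy library, that verifies the three defining properties of the decomposition for the example $f(x,y)=x^2$ (any polynomial in $x$ and $y$ can be substituted for $f$):

```python
# Sketch: verify f = g + h with g symmetric and h antisymmetric (SymPy assumed).
from sympy import symbols, expand

x, y = symbols('x y')
f = x**2  # try any polynomial in x and y here

swap = {x: y, y: x}  # the transposition acting on the variables
g = (f + f.subs(swap, simultaneous=True)) / 2   # symmetric part
h = (f - f.subs(swap, simultaneous=True)) / 2   # antisymmetric part

assert expand(f - (g + h)) == 0                          # f = g + h
assert expand(g - g.subs(swap, simultaneous=True)) == 0  # g(x,y) = g(y,x)
assert expand(h + h.subs(swap, simultaneous=True)) == 0  # h(x,y) = -h(y,x)
```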

This got me thinking…

Is there an analogous decomposition for polynomials in three variables? Or any number of variables?

The above decomposition doesn’t make sense in three variables, but perhaps every polynomial in $x$, $y$, and $z$ can be written uniquely as a sum of a symmetric, antisymmetric, and… some other particular type(s) of polynomials.

Indeed, it can be generalized in the following sense. Notice that any antisymmetric polynomial $h$ in two variables is divisible by $x-y$, since setting $x=y$ gives us $h(x,x)=-h(x,x)=0$. Moreover, dividing by $x-y$ gives a symmetric polynomial: if \[h(x,y)=p(x,y)\cdot(x-y)\] is antisymmetric, then $p(x,y)\cdot (x-y)=-p(y,x)\cdot(y-x)$, and so $p(x,y)=p(y,x)$.

Thus any antisymmetric polynomial $h$ is equal to $x-y$ times a symmetric polynomial, and so we can restate our fact above in the following way:

Any two-variable polynomial $f(x,y)$ can be written uniquely as a linear combination of $1$ and $x-y$, using symmetric polynomials as the coefficients.

For instance, $f(x,y)=x^2$ can be written as \[x^2=\left(\frac{x^2+y^2}{2}\right)\cdot 1+\left(\frac{x+y}{2}\right)\cdot (x-y).\]
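Continuing the SymPy sketch above (Python with SymPy still assumed), one can recover both symmetric coefficients directly: the coefficient of $1$ is the symmetric part of $f$, and the coefficient of $x-y$ is the antisymmetric part divided by $x-y$:

```python
# Sketch: compute the symmetric coefficients of 1 and (x - y) for f = x**2.
from sympy import symbols, expand, quo

x, y = symbols('x y')
f = x**2

g = expand((f + f.subs({x: y, y: x}, simultaneous=True)) / 2)  # coefficient of 1
h = expand((f - f.subs({x: y, y: x}, simultaneous=True)) / 2)  # antisymmetric part
p = quo(h, x - y)  # exact division by x - y; the quotient is symmetric

print(g, p)  # x**2/2 + y**2/2  and  x/2 + y/2, matching the identity above
assert expand(g + p*(x - y) - f) == 0
```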

Now, to generalize this to three variables, in place of $x-y$, we consider the polynomial \[\Delta=(x-y)(x-z)(y-z).\] Also consider the five polynomials:

  • $x^2-z^2+2yz-2xy$,
  • $z^2-y^2+2xy-2xz$,
  • $x-y$,
  • $y-z$, and
  • $1$,

each of which is obtained by taking certain partial derivatives starting with $\Delta$. It turns out that every polynomial in three variables can be decomposed uniquely as a linear combination of these six polynomials, using symmetric polynomials as the coefficients!
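As a quick sanity check (again a SymPy sketch; the claim about which derivatives to take is verified here only up to constant factors), each of the five polynomials above really does appear among the partial derivatives of $\Delta$:

```python
# Sketch: the five listed polynomials arise as partial derivatives of Delta.
from sympy import symbols, diff, expand, factor

x, y, z = symbols('x y z')
Delta = expand((x - y)*(x - z)*(y - z))

print(expand(diff(Delta, y)))     # equals x^2 - z^2 + 2yz - 2xy (first listed)
print(expand(diff(Delta, x)))     # equals z^2 - y^2 + 2xy - 2xz (second listed)
print(factor(diff(Delta, z, 2)))  # 2*(x - y)
print(factor(diff(Delta, y, z)))  # 2*(y - z)
print(diff(Delta, y, z, z))       # -2, a nonzero constant
```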

Where do these six polynomials come from?


To understand what’s really going on here, we need to consider the ring of coinvariants for the action of the symmetric group $S_n$ on $\mathbb{C}[x_1,\ldots,x_n]$, where a permutation of $\{1,\ldots,n\}$ acts by permuting the variables accordingly.

An invariant is any polynomial fixed under this action, so it’s just another name for a symmetric function here. Recall that the ring of symmetric functions is generated as an algebra by the power sum symmetric functions $p_1,\ldots,p_n$ where \[p_i=x_1^i+x_2^i+\cdots+x_n^i.\] Thus the ideal $(p_1,\ldots,p_n)$ is precisely the ideal generated by the invariants having no constant term, and we define the ring of coinvariants to be the quotient ring \[\mathbb{C}[x_1,\ldots,x_n]/(p_1,\ldots,p_n)\] by this ideal.

In the case of two variables $x$ and $y$, the ring of coinvariants is simply \[\mathbb{C}[x,y]/(x+y,x^2+y^2).\] Notice that this ideal $I=(x+y,x^2+y^2)$ is also generated by the elementary symmetric functions $x+y$ and $xy$. Now, any monomial $x^ky^j$ is zero modulo $I$ unless one of $j$ or $k$ is zero, since otherwise it is divisible by $xy$. We also have $x^k=-x^{k-1}y$ modulo $I$, so $x^k$ is zero modulo $I$ for $k\ge 2$. Hence every monomial is zero modulo $I$ except for $1$, $x$, and $y$. Since $x+y=0$ we have $x=-y$, so in fact the two polynomials $1$ and $x$ generate the ring of coinvariants as a $\mathbb{C}$-vector space.
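One can watch this collapse happen computationally, for instance by reducing monomials modulo a Gröbner basis of $I$ (a SymPy sketch; the monomial order is chosen here so that the remainders are expressed in terms of $x$):

```python
# Sketch: reduce monomials modulo I = (x + y, x^2 + y^2) in C[x, y].
from sympy import symbols, groebner

x, y = symbols('x y')
# Order the variables as (y, x) so normal forms are written using x.
G = groebner([x + y, x**2 + y**2], y, x, order='lex')

for m in [x**2, x*y, y**3, x**2*y, y]:
    print(m, '->', G.reduce(m)[1])  # remainder of m modulo I
# Every monomial of degree >= 2 reduces to 0, and y reduces to -x,
# so 1 and x span the ring of coinvariants.
```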

Equivalently, $1$ and $x-y$ generate the ring of coinvariants in two variables (sound familiar?), and the subspaces they span are $S_2$-invariant: the former spans the trivial representation and the latter the alternating (sign) representation. (See this post for more information on the characters of the symmetric group.) Thus the coinvariant ring decomposes as the direct sum of the trivial and sign representations of $S_2$, and we see that it is therefore isomorphic to the regular representation, the representation formed by $S_2$ acting on its group algebra $\mathbb{C}[S_2]$ by left multiplication.

In fact, this holds in general:

The ring of coinvariants $\mathbb{C}[x_1,\ldots,x_n]/(p_1,\ldots,p_n)$ is isomorphic to the regular representation of $S_n$.

For a proof of this, see for instance Stanley’s paper on invariants of finite groups, Proposition 4.9. Rather than going into the proof, however, let’s focus on how this relates to the decomposition above.

There is a nice fact called Chevalley’s theorem that works for finite groups generated by reflections. For instance, $S_n$ is generated by transpositions, which can be realized as reflections across hyperplanes. One form of Chevalley’s theorem states that if such a group $G$ acts on the ring $R=\mathbb{C}[x_1,\ldots,x_n]$, then $R$ is a finitely generated free module over the ring of invariants $R^G$. Moreover, $R$ is generated as an $R^G$-module by any set of coset representatives for $R/(R^G_+)$, where $(R^G_+)$ denotes the ideal generated by the invariants with no constant term. (See, for instance, Humphreys’ book, section 3.6. Thanks to Francois Bergeron for pointing it out to me!)

In the case of $S_n$, we can state the above as the following tensor product decomposition: \[\mathbb{C}[x_1,\ldots,x_n]=\mathbb{C}[p_1,\ldots,p_n]\otimes_{\mathbb{C}[S_n]}\mathbb{C}[x_1,\ldots,x_n]/(p_1,\ldots,p_n)\] where $\mathbb{C}[S_n]$ denotes the group algebra of $S_n$. Tensoring with the subring of symmetric functions has the effect of “changing basis” to make the ring of coinvariants into a module over the symmetric functions, and Chevalley’s theorem says that this module is exactly the entire ring $\mathbb{C}[x_1,\ldots,x_n]$, considered as a module over its subring $\mathbb{C}[p_1,\ldots,p_n]$.

Going back to the case of two variables, this decomposition becomes \[\mathbb{C}[x,y]=\Lambda_{\mathbb{C}}(x,y)\otimes_{\mathbb{C}[S_2]} \mathbb{C}[x,y]/(x+y,xy),\] where $\Lambda_{\mathbb{C}}(x,y)=\mathbb{C}[p_1,p_2]$ denotes the ring of symmetric functions in $x$ and $y$. Since $1$ and $x-y$ form a basis for $\mathbb{C}[x,y]/(x+y,xy)$, this can be interpreted as the fact that every polynomial in $\mathbb{C}[x,y]$ can be written uniquely in the form $f(x,y)\cdot 1+ g(x,y)\cdot (x-y)$ where $f$ and $g$ are symmetric.

So, we now need only to generalize the polynomials $1$ and $x-y$ to higher numbers of variables. Our question becomes:

What is a natural basis for \[\mathbb{C}[x_1,\ldots,x_n]/(p_1,\ldots,p_n)?\]

It turns out that there is a nice explicit basis for $\mathbb{C}[x_1,\ldots,x_n]/(p_1,\ldots,p_n)$: the Vandermonde determinant $\Delta$ and all its partial derivatives. The $n\times n$ Vandermonde determinant is the determinant of the following matrix: \[\left(\begin{array}{cccc} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \\ \vdots & \vdots & \ddots & \vdots \\ x_1^{n-1} & x_2^{n-1} & \cdots & x_n^{n-1} \end{array}\right)\]

Explicitly, this determinant equals $\prod_{i < j}(x_j-x_i)$, which up to sign is \[\Delta=\prod_{i < j}(x_i-x_j).\] So in the case of two variables, $\Delta=x-y$, and in the case of three variables, $\Delta=(x-y)(x-z)(y-z)$, as before.
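Here is a one-assertion SymPy sketch confirming the product formula for the determinant in three variables (the product computed below is $\prod_{i<j}(x_j-x_i)$, which is $\pm\Delta$):

```python
# Sketch: the 3x3 Vandermonde determinant equals prod_{i<j} (x_j - x_i).
from sympy import symbols, Matrix, prod, expand

xs = symbols('x1 x2 x3')
n = len(xs)
V = Matrix([[xs[j]**i for j in range(n)] for i in range(n)])  # rows 1, x, x^2
det_formula = prod(xs[j] - xs[i] for i in range(n) for j in range(i + 1, n))
assert expand(V.det() - det_formula) == 0
```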

Now, consider all partial derivatives of $\Delta$, that is, all polynomials of the form \[\frac{\partial}{\partial x_{i_1}}\frac{\partial}{\partial x_{i_2}}\cdots\frac{\partial}{\partial x_{i_k}} \Delta\] for some $k$. The claim is that a subset of these forms a $\mathbb{C}$-basis for the ring of coinvariants; to prove this, we need to show that enough of them are mutually linearly independent.

We begin by computing the dimension of the $d$th graded piece of the coinvariant ring for all $d$. Consider again the decomposition \[\mathbb{C}[x_1,\ldots,x_n]=\mathbb{C}[p_1,\ldots,p_n]\otimes_{\mathbb{C}[S_n]}\mathbb{C}[x_1,\ldots,x_n]/(p_1,\ldots,p_n),\] and consider the Hilbert series of both sides. (Recall that the Hilbert series of a graded ring $R=\oplus_d R_d$ is the generating function $H_R(q)=\sum_d \dim_{\mathbb{C}}(R_d) q^d$.) On the left hand side, it is a simple combinatorial exercise to see that the Hilbert series of $\mathbb{C}[x_1,\ldots,x_n]$ is \[\frac{1}{(1-q)^n}.\] On the right hand side, recall that the bases for the ring of symmetric functions in $n$ variables are indexed by partitions with at most $n$ parts. So if $p_n(d)$ denotes the number of partitions of $d$ having at most $n$ parts, then the Hilbert series of $\mathbb{C}[p_1,\ldots,p_n]$ is the generating function \[\sum_d p_n(d)q^d=\frac{1}{(1-q)(1-q^2)\cdots(1-q^n)}.\]
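For example, one can check the first several coefficients of this generating function against a direct count of partitions (a SymPy sketch with $n=3$; the coefficients $1,1,2,3,4,5,7,8,\ldots$ count partitions with at most $3$ parts):

```python
# Sketch: coefficients of 1/((1-q)(1-q^2)(1-q^3)) count partitions into <= 3 parts.
from sympy import symbols, prod, series

q = symbols('q')
n = 3
f = prod(1/(1 - q**i) for i in range(1, n + 1))
print(series(f, q, 0, 8))
# 1 + q + 2*q**2 + 3*q**3 + 4*q**4 + 5*q**5 + 7*q**6 + 8*q**7 + O(q**8)
```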

Finally, since Hilbert series are multiplicative across tensor products, we have the equation \[\frac{1}{(1-q)^n}=\frac{1}{(1-q)(1-q^2)\cdots(1-q^n)}\operatorname{Hilb}(\mathbb{C}[x_1,\ldots,x_n]/(p_1,\ldots,p_n)).\]

Solving, we find \[\operatorname{Hilb}(\mathbb{C}[x_1,\ldots,x_n]/(p_1,\ldots,p_n))=\frac{(1-q)(1-q^2)\cdots(1-q^n)}{(1-q)^n}=(n)_q!,\] the $q$-factorial $(1+q)(1+q+q^2)\cdots(1+q+\cdots+q^{n-1})$. This means that, in particular, the highest degree elements in the coinvariant ring have degree $\binom{n}{2}$, and this degree component has dimension $1$. Thus the Vandermonde determinant $\Delta$ spans the top degree component, and as an $S_n$-module it is a single copy of the sign representation. (Setting $q=1$ in the Hilbert series formula also immediately implies that the dimension of the entire coinvariant ring is $n!$.)
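Both consequences are easy to verify for small $n$; here is a SymPy sketch for $n=3$, where the $q$-factorial is $1+2q+2q^2+q^3$:

```python
# Sketch: the Hilbert series of the coinvariant ring for n = 3 is the q-factorial.
from sympy import symbols, prod, cancel, expand, Poly

q = symbols('q')
n = 3
hilb = cancel(prod(1 - q**i for i in range(1, n + 1)) / (1 - q)**n)
print(expand(hilb))                 # q**3 + 2*q**2 + 2*q + 1
assert Poly(hilb, q).degree() == 3  # top degree is binomial(3, 2) = 3
assert hilb.subs(q, 1) == 6         # total dimension is 3! = 6
```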

Finally, to show that the partial derivatives of $\Delta$ span the ring of coinvariants, we pass to a dual object, the ring of harmonics. Consider the inner product on $\mathbb{C}[x_1,\ldots,x_n]$ given by \[\langle f,g \rangle=(\partial_f(g))_0,\] where $\partial_f$ denotes the differential operator formed by replacing each $x_i$ in the polynomial $f$ by $\frac{\partial}{\partial x_i}$, and the subscript $0$ indicates that we take the constant term. This is a well-defined inner product for which the monomials form an orthogonal basis. Under this inner product, we can consider the orthogonal complement of the ideal $I=(p_1,\ldots,p_n)$.

This orthogonal complement can be described explicitly as the space of all polynomials $f$ which are killed by the “power sum operators” $\frac{\partial^k}{\partial x_1^k}+\cdots+\frac{\partial^k}{\partial x_n^k}$ for all $k$. This space is called the ring of harmonics, and it is isomorphic to the ring of coinvariants since it is the orthogonal complement of the ideal $I$ that we quotient by.
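For instance, here is a SymPy check (for $n=3$) that $\Delta$ itself is harmonic, i.e. killed by every power sum operator:

```python
# Sketch: the power sum operators annihilate Delta for n = 3.
from sympy import symbols, diff, expand

x1, x2, x3 = symbols('x1 x2 x3')
Delta = expand((x1 - x2)*(x1 - x3)*(x2 - x3))

for k in range(1, 5):
    # apply d^k/dx_1^k + d^k/dx_2^k + d^k/dx_3^k to Delta
    assert expand(sum(diff(Delta, v, k) for v in (x1, x2, x3))) == 0
```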

Finally, since the Vandermonde determinant $\Delta$ is in the ring of harmonics, all its partial derivatives are harmonic as well: the power sum operators commute with the partial derivative operators, so they kill every derivative of $\Delta$. Moreover, if we consider only the partial derivatives of the form \[\frac{\partial^{a_1}}{\partial x_1^{a_1}}\cdots \frac{\partial^{a_n}}{\partial x_n^{a_n}}\Delta\] in which $a_i\le i-1$ for all $i$, then there are exactly $n!$ of them, and their leading terms in lexicographic order are all distinct. Thus they are linearly independent and span the ring of harmonics, which has dimension $n!$.
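And here is a sketch verifying this final counting argument for $n=3$ (SymPy again; a rank computation stands in for the leading-term argument):

```python
# Sketch: the 3! = 6 derivatives of Delta with a_i <= i - 1 are linearly independent.
from itertools import product
from sympy import symbols, diff, expand, Poly, Matrix

x1, x2, x3 = symbols('x1 x2 x3')
xs = (x1, x2, x3)
Delta = expand((x1 - x2)*(x1 - x3)*(x2 - x3))

derivs = []
for a in product(range(1), range(2), range(3)):  # a_i ranges over 0, ..., i-1
    g = Delta
    for v, k in zip(xs, a):
        g = diff(g, v, k)
    derivs.append(expand(g))

# Write each derivative as a coefficient vector over all monomials that occur.
monomials = sorted({m for g in derivs for m in Poly(g, *xs).as_dict()})
M = Matrix([[Poly(g, *xs).as_dict().get(m, 0) for m in monomials] for g in derivs])
assert len(derivs) == 6 and M.rank() == 6
```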

(Combinatorial Bonus: Notice that this choice of partial derivatives corresponds with the Carlitz Codes described in this post, which are $q$-counted by the $q$-factorial.)

To sum up, it is possible to generalize the nice decomposition of polynomials in two variables into a symmetric and antisymmetric component, by using $n!$ components for the $n$-variable case. These components correspond to the $n!$ partial derivatives of $\Delta=\prod_{i < j} (x_i-x_j)$ that come from the Carlitz codes $a_1,\ldots,a_n$ with $a_i\le i-1$ for each $i$. Mystery solved!