Finding Gemstones: on the quest for mathematical beauty and truth

The Fundamental Theorem of Symmetric Function Theory

This little gemstone is hidden within the folds of algebraic combinatorics, but certainly deserves its name. Just as the Fundamental Theorem of Arithmetic gives us a way of writing common objects (numbers) in a canonical form (prime factorization), the Fundamental Theorem of Symmetric Function Theory allows us to express any symmetric function in a useful canonical form.

First, some background on symmetric functions. Like generating functions, symmetric functions arise naturally in combinatorics, and the coefficients are often the point of interest. In the poetic words of Herbert Wilf in his book generatingfunctionology:

A generating function is a clothesline on which we hang up a sequence of numbers for display.

I claim, and aim to demonstrate here, that

A symmetric function is a web of clotheslines on which we hang up a collection of numbers that are indexed by partitions.

Okay, so the clothesline might be a bit more complicated in this case. But it is much more useful for areas of mathematics in which partitions play an important role.

Let’s be more rigorous. A symmetric function (over $\mathbb{Q}$) is a polynomial or series in many variables with rational coefficients which is invariant (unchanging) under permutations of the variables. For instance, $x^2 + y^2 + z^2 + 2xyz$ is a symmetric function in three variables, but $x^2 y + y^2 z + z^2 x$ is not, because interchanging $x$ and $y$ does not leave the function unchanged.
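To make the invariance condition concrete, here is a little Python/SymPy sketch (the helper name `is_symmetric` is just my own choice) that tests a polynomial by substituting every permutation of its variables:

```python
from itertools import permutations
from sympy import symbols, expand

x, y, z = symbols('x y z')

def is_symmetric(f, xs):
    """Return True if f is unchanged by every permutation of the variables xs."""
    return all(
        expand(f.xreplace(dict(zip(xs, perm))) - f) == 0
        for perm in permutations(xs)
    )

print(is_symmetric(x**2 + y**2 + z**2 + 2*x*y*z, (x, y, z)))  # True
print(is_symmetric(x**2*y + y**2*z + z**2*x, (x, y, z)))      # False
```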

I find that writing out a symmetric function explicitly as a polynomial, like those above, is similar to expressing a number in base ten - sometimes it is useful, but other times you need its prime factorization. The “primes” of symmetric function theory are the elementary symmetric functions.

First, define $e_n(x_1,\ldots,x_r)=\sum_{i_1<\cdots<i_n} x_{i_1}\cdots x_{i_n}.$ For instance, \[e_3(a,b,c,d)=abc+abd+acd+bcd.\] Let $\lambda=(\lambda_1,\lambda_2,\ldots,\lambda_k)$ be a partition, that is, a finite sequence of positive integers written in nonincreasing order. Then the elementary symmetric function corresponding to $\lambda$ is defined to be the product \[e_\lambda=e_{\lambda_1}e_{\lambda_2}\cdots e_{\lambda_k}.\] The remarkable fact is that the symmetric functions $e_\lambda$ form a basis for the $\mathbb{Q}$-module of symmetric functions! In other words:

Fundamental Theorem of Symmetric Function Theory: Every symmetric function can be written uniquely in the form $\sum_{\lambda} c_{\lambda}e_\lambda$, where each $c_\lambda\in \mathbb{Q}$.

So, the $e_\lambda$'s are a clothesline of sorts, indexed by the partitions, on which to hang coefficients.
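If you like to experiment, the elementary symmetric functions are easy to generate by brute force. Here is a short Python/SymPy sketch (the helper names `e` and `e_lam` are my own) that builds $e_n$ and $e_\lambda$ in any finite set of variables:

```python
from itertools import combinations
from sympy import symbols, Add, Mul, expand

def e(n, xs):
    """Elementary symmetric function e_n in the variables xs:
    the sum, over all ways to choose n distinct variables, of their product."""
    return Add(*(Mul(*combo) for combo in combinations(xs, n)))

def e_lam(lam, xs):
    """e_lambda = e_{lambda_1} * e_{lambda_2} * ... for a partition lam."""
    return Mul(*(e(part, xs) for part in lam))

a, b, c, d = symbols('a b c d')
print(e(3, (a, b, c, d)))                # a*b*c + a*b*d + a*c*d + b*c*d
print(expand(e_lam((2, 1), (a, b, c))))  # e_2 * e_1 expanded, in three variables
```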

For example, let’s try to write the symmetric function $x^3+y^3+z^3$ in terms of elementary symmetric functions. We have the functions $e_1=x+y+z$, $e_2=xy+yz+zx$, and $e_3=xyz$ to work with. We can start by cubing $e_1$: \[e_1^3=x^3+y^3+z^3+3(x^2y+y^2x+x^2z+z^2x+y^2z+z^2y)+6xyz\] We can replace the $6xyz$ with $6e_3$. We now only need to write the term $3(x^2y+y^2x+x^2z+z^2x+y^2z+z^2y)$ in terms of $e_i$'s. The product $3e_1e_2$ is pretty close, but leaves us with an extra $9xyz$ that we need to subtract out, giving us $3e_1e_2-9e_3$ for the middle term. Putting this all together and solving for $x^3+y^3+z^3$, we find:

\[x^3+y^3+z^3=e_1^3-3e_1e_2+3e_3=e_{(1,1,1)}-3e_{(2,1)}+3e_{(3)}.\] It is fairly clear that a similar process can be used to express any symmetric function in terms of $e_{\lambda}$'s. But why is it unique?
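Before getting to uniqueness, here is a quick computational check of this identity in SymPy. The call to `symmetrize` (from `sympy.polys.polyfuncs`, available in recent SymPy versions) asks SymPy to produce such a decomposition on its own, in terms of its formal names $s_1,s_2,s_3$ for $e_1,e_2,e_3$:

```python
from sympy import symbols, expand
from sympy.polys.polyfuncs import symmetrize

x, y, z = symbols('x y z')
e1, e2, e3 = x + y + z, x*y + y*z + z*x, x*y*z

# Check the identity derived above.
print(expand(e1**3 - 3*e1*e2 + 3*e3) == expand(x**3 + y**3 + z**3))  # True

# SymPy can also find such a decomposition itself; with formal=True it returns
# the answer in terms of s1, s2, s3, its own names for e_1, e_2, e_3.
print(symmetrize(x**3 + y**3 + z**3, formal=True))
```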

Perhaps the simplest and most essential symmetric functions are the monomial symmetric functions. Given a partition $\lambda=(\lambda_1,\ldots,\lambda_k)$, we define $m_\lambda$ to be the smallest symmetric function containing the monomial $x_1^{\lambda_1}x_2^{\lambda_2}\cdots x_k^{\lambda_k}$ as a term, that is, the sum of all distinct monomials that can be obtained from this one by permuting the variables. (In this notation, the elementary symmetric function $e_n$ is simply $m_{(1^n)}$.)

For example, the monomial symmetric function $m_{(2,1)}$ in three variables is \[m_{(2,1)}=x^2 y + y^2 x + x^2 z + z^2 x + y^2 z + z^2 y,\] since it is the smallest symmetric function containing the monomial $x^2y$ as a term. (Notice that the meaning changes with the number of variables: in two variables we have $m_{(2,1)}=x^2y+y^2x$, and in only one variable, $m_{(2,1)}=0$.)
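Here is a minimal Python/SymPy sketch of $m_\lambda$ (with `m` as my own name for the helper), which also illustrates how the answer depends on the number of variables:

```python
from itertools import permutations
from sympy import symbols, Add, Mul

def m(lam, xs):
    """Monomial symmetric function m_lambda in the variables xs: the sum of all
    distinct monomials whose nonzero exponents, sorted, form the partition lam."""
    if len(lam) > len(xs):
        return 0  # not enough variables, so m_lambda is zero
    exps = tuple(lam) + (0,) * (len(xs) - len(lam))
    distinct = set(permutations(exps))  # distinct rearrangements of the exponents
    return Add(*(Mul(*(v**a for v, a in zip(xs, p))) for p in distinct))

x, y, z = symbols('x y z')
print(m((2, 1), (x, y, z)))  # the six terms of m_(2,1) in three variables
print(m((2, 1), (x, y)))     # x**2*y + x*y**2
print(m((2, 1), (x,)))       # 0
```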

The monomial symmetric functions are also enough to describe every symmetric function:

Every symmetric function can be written uniquely in the form $\sum_{\lambda} a_{\lambda}m_\lambda$, where each $a_\lambda\in \mathbb{Q}$.

For instance, in the two variables $x$ and $y$, we can write

\[x^3y+ y^3x+2xy-3x^4-3y^4=m_{(3,1)}+2m_{(1,1)}-3m_{(4)}.\]

Notice that the coefficients of the terms having the same multiset of exponents, such as $x^3y$ and $y^3x$, must be equal for the function to be symmetric. It follows that every symmetric function is uniquely a sum of constant multiples of $m_\lambda$'s.
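This observation also gives a mechanical way to read off the $m_\lambda$-expansion of a symmetric polynomial: group its monomials by the partition obtained by sorting their exponents. Here is a small Python/SymPy sketch of the idea (the helper `m_expansion` is my own, and it assumes its input really is symmetric):

```python
from collections import defaultdict
from sympy import symbols, Poly

def m_expansion(f, xs):
    """Read off the coefficients a_lambda in f = sum_lambda a_lambda * m_lambda,
    by grouping monomials according to the partition of their sorted exponents."""
    groups = defaultdict(set)
    for exps, coeff in Poly(f, *xs).terms():
        lam = tuple(sorted((a for a in exps if a > 0), reverse=True))
        groups[lam].add(coeff)
    # For a symmetric polynomial, each group has a single common coefficient.
    return {lam: coeffs.pop() for lam, coeffs in groups.items()}

x, y = symbols('x y')
f = x**3*y + y**3*x + 2*x*y - 3*x**4 - 3*y**4
print(m_expansion(f, (x, y)))  # coefficients 1, 2, -3 on m_(3,1), m_(1,1), m_(4)
```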

We can now prove the Fundamental Theorem by expressing the $e_\lambda$'s of degree $d$ in terms of the $m_\lambda$'s of degree $d$, and showing that the transition matrix is invertible. (Thanks to Mark Haiman for introducing me to this proof, which follows below.)

A proof of the Fundamental Theorem, for the curious:

We start by expressing $e_\lambda$ in terms of monomial symmetric functions.

Notice that, in the expansion of the product $e_{\lambda_1} e_{\lambda_2} \cdots e_{\lambda_k}$, the coefficient of $x_1^{\alpha_1}x_2^{\alpha_2}\cdots x_l^{\alpha_l}$ is the number of ways of choosing one monomial from each $e_{\lambda_i}$ so that there are a total of $\alpha_j$ occurrences of $x_j$ for each $j$. Each monomial appearing in $e_{\lambda_i}$ is squarefree, so its exponent vector is a string of $0$'s and $1$'s summing to $\lambda_i$. Writing the exponent vectors of the chosen monomials as the rows of a matrix, the coefficient is simply the number of matrices of zeroes and ones with row sums $\lambda_i$ and column sums $\alpha_j$.

Grouping these monomials into $m_\lambda$'s, we now have \[e_\lambda=\sum_{\mu} M_{\lambda\mu} m_\mu\] where $M_{\lambda\mu}$ is the number of matrices of $0$'s and $1$'s with row sums $\lambda_i$ and column sums $\mu_j$.
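These coefficients are easy to compute by brute force. Here is a Python sketch (the helper `count_01_matrices` is my own naive recursion, fine for small partitions) that recovers, for example, the expansion $e_{(2,1)}=m_{(2,1)}+3m_{(1,1,1)}$:

```python
from itertools import combinations

def count_01_matrices(rows, cols):
    """Count matrices of 0's and 1's with the given row sums and column sums,
    by brute force: choose the support of the first row, then recurse."""
    if not rows:
        return 1 if all(c == 0 for c in cols) else 0
    total = 0
    for support in combinations(range(len(cols)), rows[0]):
        if all(cols[j] >= 1 for j in support):
            remaining = list(cols)
            for j in support:
                remaining[j] -= 1
            total += count_01_matrices(rows[1:], tuple(remaining))
    return total

print(count_01_matrices((2, 1), (2, 1)))     # M_{(2,1),(2,1)} = 1
print(count_01_matrices((2, 1), (1, 1, 1)))  # M_{(2,1),(1,1,1)} = 3
```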

We now only need to show that the matrix of coefficients $M_{\lambda\mu}$, as $\lambda$ and $\mu$ range over the partitions of a fixed size $n$, is invertible.

We say a partition $\lambda$ dominates $\mu$, written $\lambda\ge \mu$, if the partial sums of $\lambda$ are never less than those of $\mu$: $\lambda_1+\cdots+\lambda_i\ge \mu_1+\cdots+\mu_i$ for all $i$. Dominance is a partial ordering on the partitions of $n$, and so we can extend it to a total ordering of the partitions, say $\lambda^{1}, \lambda^{2},\ldots$, chosen so that $\lambda^{i}$ is listed before $\lambda^{j}$ whenever $\lambda^{i}$ is strictly dominated by $\lambda^{j}$.
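A dominance check is only a few lines of Python (the helper `dominates` is my own name; it pads with zeros so partitions of different lengths can be compared):

```python
def dominates(lam, mu):
    """Return True if the partition lam dominates mu, i.e. every partial sum
    of lam is at least the corresponding partial sum of mu."""
    k = max(len(lam), len(mu))
    lam = tuple(lam) + (0,) * (k - len(lam))
    mu = tuple(mu) + (0,) * (k - len(mu))
    sum_lam = sum_mu = 0
    for a, b in zip(lam, mu):
        sum_lam += a
        sum_mu += b
        if sum_lam < sum_mu:
            return False
    return True

print(dominates((3, 1), (2, 2)))     # True
print(dominates((2, 2), (3, 1)))     # False
print(dominates((2, 2), (2, 1, 1)))  # True
```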

For a partition $\lambda$, let its conjugate, $\lambda^\ast$, be the partition formed by transposing its Young diagram; for example, $(3,1)^\ast=(2,1,1)$. We will show that if we arrange the matrix of coefficients $M_{\lambda\mu}$ so that its rows are indexed by $\lambda^1,\lambda^2,\ldots$ and its columns by $(\lambda^1)^\ast, (\lambda^2)^\ast,\ldots$, then the matrix is upper triangular with $1$'s on the diagonal.

Indeed, we have $M_{\lambda\lambda^\ast}=1$, since the only matrix of $0$'s and $1$'s with row sums $\lambda_i$ and column sums $\lambda_j^\ast$ is the matrix with the $\lambda_i$ $1$'s in each row aligned all the way to the left, forming the Young diagram of $\lambda$ out of $1$'s. (To see this, note that the first column sum $\lambda_1^\ast$ equals the number of rows, so every entry of the first column must be a $1$, and the same argument applies to each subsequent column in turn.)

It only remains to show that $M_{\lambda\mu}$ is $0$ for $\mu\not\le \lambda^\ast$. Suppose $M_{\lambda\mu}$ is nonzero. Then there is a matrix $A$ of $0$'s and $1$'s with row sums $\lambda_i$ and column sums $\mu_j$. These column sums are dominated by the column sums of the matrix formed by shifting all the $1$'s in each row of $A$ to the left, since left-justifying only moves $1$'s into earlier columns, and so can only increase the number of $1$'s among the first $i$ columns, for each $i$. But these new column sums are the parts of $\lambda^\ast$, and so $\mu\le\lambda^{\ast}$.
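For the computationally inclined, here is a quick sanity check of this triangularity claim (a Python sketch with helper names of my own choosing). It builds the matrix $(M_{\lambda\mu})$ for the partitions of $n=4$, using increasing lexicographic order as one convenient total ordering refining dominance, and confirms that it is upper triangular with $1$'s on the diagonal:

```python
from itertools import combinations

def partitions(n, largest=None):
    """Generate all partitions of n as nonincreasing tuples."""
    largest = n if largest is None else largest
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def conjugate(lam):
    """Conjugate partition: transpose the Young diagram of lam."""
    return tuple(sum(1 for part in lam if part > j) for j in range(lam[0])) if lam else ()

def count_01_matrices(rows, cols):
    """Count 0-1 matrices with the given row and column sums (same brute force as above)."""
    if not rows:
        return 1 if all(c == 0 for c in cols) else 0
    total = 0
    for support in combinations(range(len(cols)), rows[0]):
        if all(cols[j] >= 1 for j in support):
            remaining = list(cols)
            for j in support:
                remaining[j] -= 1
            total += count_01_matrices(rows[1:], tuple(remaining))
    return total

n = 4
# Lexicographic order refines dominance, so sorting the partitions of n in
# increasing lex order lists dominance-smaller partitions first.
parts = sorted(partitions(n))
M = [[count_01_matrices(lam, conjugate(mu)) for mu in parts] for lam in parts]
for row in M:
    print(row)
print(all(M[i][i] == 1 for i in range(len(parts))))                    # 1's on the diagonal
print(all(M[i][j] == 0 for i in range(len(parts)) for j in range(i)))  # upper triangular
```

The two `True`s at the end confirm the unitriangularity, at least for this small case.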

This completes the proof! Stay tuned - there are more symmetric function gemstones to be uncovered in future posts.