Finding Gemstones: on the quest for mathematical beauty and truth
< Previous Post Next Post >

FindStat and combinatorial statistics

Last semester, I attended Sage Days 54 at UC Davis. In addition to learning about Sage development (perhaps a topic for a later blog post), I was introduced to FindStat, a new online database of combinatorial statistics.

You may be familiar with the Online Encyclopedia of Integer Sequences; the idea of FindStat is similar, and somewhat more general. The Online Encyclopedia of Integer Sequences is a database of mathematically significant sequences, and to search the database you can simply enter a list of numbers. It will return all the sequences containing your list as a consecutive subsequence, along with the mathematical significance of each such sequence and any other relevant information.

FindStat does the same thing, but with combinatorial statistics instead of sequences. A combinatorial statistic is any integer-valued function defined on a set of combinatorial objects (such as graphs, permutations, posets, and so on). Some common examples of combinatorial statistics are:

  • The number of edges of a finite simple graph,
  • The length of a permutation, that is, the smallest length of a decomposition of the permutation into transpositions,
  • The number of parts of a partition,
  • The diameter of a tree.

The FindStat database has a number of combinatorial objects programmed in, with various statistics assigned to them, which can all be viewed in the Statistics Database tab. The search functionality is under Statistic Finder, in which you can choose a combinatorial object, say graphs, and enter some values for some of the graphs. It will then tell you what statistics, if any, on graphs match the values you have entered.

So this is strictly more general than OEIS: we can think of integer sequences as combinatorial statistics on some collection of combinatorial objects represented by the nonnegative integers, such as finite collections of indistinguishable balls. Not that FindStat should be used for integer sequences - OEIS already does a splendid job of that - but FindStat provides something that OEIS cannot: an organized database of mathematical data that doesn’t necessarily have a natural linear ordering.

The last, and most interesting, feature of FindStat is its ``maps” functionality. There are many known natural maps of combinatorial objects, such as the map $\phi:P\to B$ sending a permutation to its corresponding binary search tree, where $P$ denotes the set of all permutations and $B$ the set of all binary search trees. (See here for all the maps currently implemented on the Permutations class in FindStat.) Now, given a statistic $s:B\to \mathbb{Z}$ on $B$, we automatically get a statistic \[s\circ \phi:P\to \mathbb{Z}.\] FindStat uses this fact to give the user more information: it will give you not only the matching statistics on the combinatorial object that you chose, but the matching statistics on all other possible combinatorial objects linked by any relevant map in the database! This can help the working combinatorialist discover new ways of thinking about their statistics.