Doing mathematics in a pandemic – Part IV: Talks with OBS

This is the final post in a four-part series on adapting to the pandemic as a mathematician. See Part I – AlCoVEPart II – Collaboration, and Part III – Teaching.

As conferences moved online, a number of different methods of giving a remote talk became commonplace. One was to simply point a webcam at a chalkboard and lecture as usual. Another is to make slides and use the “Share Screen” option on Zoom to show the slides to the audience. Another popular method, which I have used a number of times, is to make partial handwritten “slides” in Notability or GoodNotes on an iPad, with space left for doing examples and computations, and then share the iPad screen over Zoom and walk the audience through.

Today I’ll be explaining how to use Open Broadcasting Software (OBS) to give a talk from home in which your slides show up behind you as if you were standing next to a projector screen, but are nearly as crisp as if you were reading the PDF on your own computer screen. François Bergeron at UQAM first introduced me to this method, and his COVID-19 page features excellent explanations of how he creates his own virtual talks.

First, here is what the output of my first and only attempt at using OBS in a virtual talk looked like:

The above talk was given at the Enumerative Combinatorics session of the virtual Canadian Math Society winter meeting in 2020. Other videos from this session are available here, including another example using OBS by Marni Mishna.

There are two steps to getting this working: (1) setting up your video sources in OBS, and (2) feeding the video output to Zoom.

Step 1: Video sources in OBS

The first step is to install OBS from obsproject.com. Once you install and open it, you’ll see a window with a preview of what your video project looks like. At the bottom of the windows are various menus: Scenes, Sources, Audio Mixer, etc.

You’ll only need one Scene, and you’ll add various Sources to put together the Scene, using the + button at the bottom of the Sources box. Here were the Sources I used:

  • Color Source. This allows you to essentially set a “Background color” for your scene. It defaults to a dark grey. Once you add the color source, you can right click it and click “Properties” to change its color.
  • Window Capture. This is the source that you can use to make your slides appear. Create a Window Capture source, open your slides in your pdf viewer in another window, and then go back to OBS and right click on Window Capture to select Properties. There you can tell it to use the pdf viewer as the window that you’re capturing in this source. Finally, go back to the pdf viewer and go into presentation mode, so that you can flip through the presentation with a clicker or keyboard as you would in a real classroom.

    Tip: If you’re on a Mac or another operating system that has multiple desktops, you may need to open your slides on the same desktop window as OBS is opened in for OBS to find the pdf viewer as a source. In particular, you need to feed it to Window Capture before going to full screen or presentation mode; otherwise OBS will not recognize it as a source.
  • Video Capture. This uses your laptop webcam or other webcam to capture your face. You can again right click to get to Properties to change the webcam you’re using if you wish. To filter out everything except your head, you want to use:
    • Chroma Key filter. Right click on the Video Capture source and go to “Filters”. Then click the + under the “Effect filters” box and click “Chroma Key”. You can then click on the Chroma Key filter to choose which color you want to filter out.

      Since I had a green screen (see Part III), I filtered out the color of my green screen behind my head, and voila, the video capture source only showed the outline of my head and nothing else. If you don’t have a green screen, make sure you’re positioned in front of a blank wall and then just choose the color of that wall to filter out.

Finally, make sure you order the above sources in the Sources box so that Video Capture is highest up, then Window Capture is second, then Color Source is third. This way your head appears in front of the slides which appears in front of the background color.

I resized my Window Capture layer by dragging its outline in the OBS preview window so that it sat in the upper left of the screen, leaving room for my head on the right as if I were standing next to the projector screen.

Here is a screen shot of OBS after setting up everything as above (plus an audio input source – see Optional Step 3 below):

Step 2: Connecting OBS output to Zoom

For this step, you’ll need to install the OBS Virtual Camera plugin. It’s easy to install, and once you do, if you restart OBS there will now be a “Start Virtual Camera” option under the Tools menu.

If you click “Start Virtual Camera”, this creates a virtual camera device recognized by Zoom. Now, log into Zoom, start a meeting, and click on the up arrow next to the “Stop Video” (or “Start Video”) button. There should now be an option to select “OBS Virtual Camera” as your video camera in Zoom. Select it, and you should see your OBS creation being streamed over Zoom!

Now comes a tricky technical issue. Suppose you want to use a clicker to flip through your slides in presentation mode. Then you need your laptop monitor to be focused on the displayed slides so that the click registers as a slide advancement. But then that means that you can’t see OBS or Zoom on your laptop screen, so you can’t see yourself as you’re gesturing to things on the virtual screen, and can’t aim appropriately or make sure your head isn’t blocking the words.

To solve this, what I did was to log into Zoom on both my laptop and iPad, and position the iPad in front of my laptop webcam so that the webcam still captured my head, but so that I could see myself on the iPad while my clicker clicked through the slides on my laptop. There are other solutions as well; it should be possible to port the iPad screen itself into Window Capture in OBS as well, though I haven’t personally figured out how to do this. But as long as you have one screen to click through your slides and another to view yourself, you’re good to go.

At this point, you’re nearly ready. Zoom will capture your audio as normal, and displays your new video setup from OBS, so you can give your talk!

Optional Step 3: Audio via OBS

There was one minor issue when I practiced this setup: OBS does some processing which makes the video feed into Zoom lag behind the audio that is also being captured by Zoom. It was only about a half-second lag, and people who I practiced with said it was noticeable but not a major issue.

I did find a way to fix the lag issue, however, and that was to pipe the audio through OBS as well, so that OBS took in both audio and video inputs from me and output both in sync to Zoom. Here are the steps I took to do so:

  1. Create an Audio Source in OBS. Go to Sources again and add an Audio Input Capture source. You can go to its Properties to set it to capture whichever microphone you prefer (in my case, the lapel mic that I described in Part III).
  2. Install a Virtual Audio Cable. This will have to be third party software, as at the moment OBS does not have a virtual audio plugin that resembles its virtual camera feature. I installed VB-Audio Cable, which is a program that can take output audio from one software source (in our case, OBS) and “plug it in” as the input audio to another software (Zoom).
  3. Turn on audio monitoring in OBS. To do so, in the Audio Mixer box at the bottom of your OBS screen, click the Settings wheel next to the volume button on Audio Input Capture, and go to Advanced Audio Properties. Then in the Audio Monitoring column, set both settings (for both Audio Input Capture and Mic/Aux) to “Monitor and Output”.
  4. Launch VB-Audio Cable. It needs to be running for the next steps to work. Besides this step, you don’t need to interact with the VB-Audio Cable program at all.
  5. Set OBS’s audio monitor to VB-Cable. Go to the main OBS menu at the top and click on Preferences. Click on the Audio tab, then scroll down to Advanced. There, set the Monitoring Device to VB-Cable.
  6. In your Zoom meeting, set your microphone to VB-Cable, which should appear as an option for your Zoom microphone now.

Now you’re all set! Time to go give that awesome virtual talk.

Please share your own tips for giving virtual talks in the comments below!

Doing mathematics in a pandemic – Part III: Teaching

This is the third post in a four-part series on adapting to the pandemic as a mathematician. See Part I – AlCoVE, Part II – Collaboration, and Part IV – Talks with OBS.

Of all the things I had to figure out how to adapt to the pandemic reality, I found teaching to be the most challenging by far. So much of the value of teaching comes from the in-person connection between students and teachers, and between peers in the classroom. How can you replicate an entire classroom experience on a 14 inch computer screen? How can you pull off hybrid teaching without diminishing the experience for those students who take the course remotely?

I taught two courses in the fall of 2020 – a small graduate-level class on advanced combinatorics topics, and a larger undergraduate class of 30 students on introductory combinatorics.

Graduate class: outdoor teaching

When, over the summer, the studies came out showing that outdoor transmission of the coronavirus was minimal, I decided to see if I could get outdoor teaching set up for my graduate class for at least the first half of term, with the plan of moving the class online once it got too cold.

There were a lot of considerations to take into account when setting up a good outdoor learning environment. What do you write on? What if it rained? How do you record the lectures outdoors, with good sound quality, to make sure everyone can still participate even if they have to quarantine due to a COVID-19 exposure? How do you make sure the students can hear you over the noise of nearby traffic and birds and other outdoor distractions? How do you ensure student comfort when taking notes, without having traditional desks?

Here were the tools I used to solve – or at least attempt to solve – each of these issues.

  • Rolling whiteboards. The CSU math department ordered lightweight rolling whiteboards with weather-resistant aluminum frames specifically for this purpose. There were at least three instructors who started the semester teaching outdoors, and we made a schedule of who would roll it out and who would roll it back in each day. They worked well outdoors, and as long as you could guarantee you’d be in a shady spot, there was no glare.
  • A good location. Behind the math building on my campus was a shady spot on the grass next to a large parking lot with very little daytime traffic going in and out. A generator nearby provided some ambient white noise that drowned out the traffic from a nearby road. Two trees provided a feel of being somewhat removed from the bustling campus sidewalk on the other side. It wasn’t too much effort to roll a whiteboard there. We really lucked out on that front – it was pretty much ideal.

    Not all campuses may have such a spot, and some universities solved this using outdoor tents.

    Here is a picture of the location and the whiteboard (from a meeting with a grad student, not from class):
  • USB lapel mic for recording. A lapel mic, also known as a lavalier mic – one that clips to your collar – is the best way to pick up only your voice and filter out other noises when recording outdoors. I got this one mainly for its 20-foot cable that allowed me to walk around freely at the board, and there are plenty of options out there like it.

    I plugged the mic into my laptop and did the recording using the macbook webcam and Quicktime. Nothing fancy, but it did the trick for helping students catch up or participate remotely.
  • Personal amplifier for sound projection. It’s hard to project your own voice sufficiently in an outdoor setting, especially with masks. So I got a personal amplifier that would help project my speech to the class. I found that tucking the microphone under my mask and turning the volume low was a good way to get the sound to amplify; it didn’t pick up the sound so well when it was on the other side of the mask.

    If teaching outdoors post-pandemic, I highly recommend it; without a mask it would be even better at getting accurate sound and projecting it to the class.
  • Weatherproof box and laptop stand. What if it started raining, and all my recording equipment and laptop got rained on? And how do you set your laptop or webcam up at the right height to record yourself writing on the whiteboard?

    I solved both of these issues with this large Husky storage box. It allowed me to carry out all my gadgets and whiteboard markers from my office at the start of each class all at once, and then it doubled as a stand to put my laptop on to record my lectures. It’s about the right height – you don’t want something too tall so it doesn’t block the students’ views. And then if it rains, you quickly throw everything back into the storage box.

    Luckily, my class was at 1 pm and located in the Colorado front range. The late summer/early fall weather patterns are very predictable, with thunderstorms and rain usually rolling in from the mountains in late afternoon, around 3 pm or later. So rain wasn’t generally an issue. There was one day that it started drizzling in class, but not enough that it wasn’t still pleasant to be outside or possible to take notes. Luckily everyone was there that day, so I put the recording devices back in the box and finished up the lecture without issue.
  • Lap desks for student comfort. My graduate class was very small, so I got a couple of cheap lap desks for anyone who wanted to use them. They turned out to be perfect; a simple solution worked in this case. Some students brought folding chairs, others opted to sit on the grass. Either way, student comfort was never a complaint.

There were two aspects of outdoor teaching that I hadn’t accounted for in my planning. One was the record-setting wildfires that hit our region of Colorado starting two days before class started. Some days, the air quality was simply too hazardous to spend a long period of time outside. On those days I sent an email in the morning and moved class online.

The other aspect didn’t really have to do with outdoor teaching per se, but was about in-person vs remote. At CSU, some classes were online and others were in person. It meant that some of my students really liked being in person outdoors, since then they could just stay there for their next outdoor math class in the same location. But others had to sprint to campus from their apartment to make it to my class, since they had an online class just before it that they needed to be at home for.

In the end, the students’ scheduling issues lined up in such a way that it made more sense to go fully remote after the first few weeks. But the outdoor teaching, for the short amount of time it happened, went fairly smoothly.

Graduate class: remote teaching

Ah, teaching from home. That peaceful, relaxed setting in which, halfway through the class, you hear your two-year-old twins running down the hall towards your office screaming “DIAPER FACE!!!” at the top of their lungs and cracking up, then throwing a double tantrum when Daddy frantically tries to drag them away from Mommy’s office.

Home distractions aside, in many ways the teaching setup became simpler once we moved the course online. My setup consisted of:

  • A dedicated Zoom classroom. Zoom now requires either a waiting room or a password, and I prefer the password so that students can go into the classroom early and chat before I get there, like in an in-person classroom environment. I put the Zoom ID everywhere on the website, the syllabus, on Canvas, etc. so that students can easily find it each time they try to log in.
  • An iPad and Apple Pencil for writing math real-time during class. See Part II for more details on these.
  • Notability. This is a note-taking app for the iPad that is worth every penny of it’s $10.00 price tag. It has a good note organization system as well as a nice selection of colors, pen sizes, stroke erasing, and continuous scrolling. I logged into Zoom on both my iPad and my laptop, and shared the screen on my iPad so that I could write the lecture on Notability as my “white board”. Then after the class, you can even share the Notability file as a “recording” of the class notes for the students to use – and for you to use in the future.
  • A Canvas course. My university uses Canvas as their online course organization system, and it’s become a good way to organize online classes too. I use the Announcement feature quite a bit, as well as putting all the homeworks and tests as Assignments with appropriately weighted grades so that Canvas would average their grades automatically. It also allowed a designated upload space so that students could upload their homework assignments as pdf files directly to Canvas, where I could use a tool called SpeedGrader to grade on the screen.
  • University storage space for video and note uploads. Videos take up a lot of space, and our course Canvas pages themselves were not large enough to hold a semester’s worth. Luckily, we had plenty of space on our campus OneDrive folders, and so I created a folder there for video uploads, and then linked to them on the Canvas page.
  • A tech support staff person, if possible. As the semester wore on, most instructors who were teaching remotely agreed on one thing: it was extremely time consuming and exhausting, far more than teaching in person.

    Eventually I realized why: it was the high volume of what I call “button-clicking” tasks. You have to find the Zoom lecture video recordings, rename them, and upload them to OneDrive. You have to wait for that upload to finish. You have to link to those videos from Canvas. Then you do the same for the Notability files. There are homeworks to upload and link to. There are homework solutions to grade in an online format that requires logging into Canvas with your university ID, which takes extra two-step verification because you’re working off campus. There are emails to respond to. Just a lot of very draining screen time spent clicking buttons in isolation.

    If the pandemic was going to be a long-term thing, I think it would make sense for universities to invest in a tech support crew, or perhaps hired undergraduates, to take this burden off of the professors. In my case, my tech support person was my husband, Bryan Gillespie. He had made the admirable decision to be the primary parent to our twin two-year-olds during the pandemic, which allowed me to keep up my momentum on the tenure track while we avoided the covid risks that came with childcare. But Bryan also has a Ph.D. in mathematics, so he was more than capable of uploading video files to a website and naming them appropriately.

    And he did so, so that I could play with my kids a little more and he could feel a little more connected to the world. Interesting times indeed.
Above: Bryan Gillespie, my personal tech support staff.

Undergraduate class: Online

My undergraduate class, Introduction to Combinatorics, was one of two sections, the other being taught by Rachel Pries. Both sections were run fully online, and Rachel and I teamed up to make a plan for how we would run the course. It consisted of both synchronous classes and asynchronous components.

The asynchronous components were assigned readings as well as 10-minute videos that we created to go along with each lecture. The idea was that the students would watch the 10 minute video before coming to class, and then in-class time could be devoted to discussions, student presentations, and problems. It worked very well for a first combinatorics class, which doesn’t require a lot of theory but does benefit from a lot of practice.

The videos were a lot of work to make, but I wanted to make a resource that I or other professors could use in the future, so I put some effort into making them high-quality. Here is one of them, which I uploaded to YouTube in order to share here on this post:

As you can see, I created the videos in a way that would most closely resemble an actual lecture format, with me standing in front of the mathematics and speaking to the audience, but without the glare and blurriness of me actually standing at a physical whiteboard and pointing a webcam at it. Here was how I created them:

  1. A green screen. I purchased a green screen for my home office. A green screen allows video software like Zoom to more crisply cut you out and put a chosen background behind you, since it allows the video processor to just filter out everything that is the shade of the green screen. So as long as you aren’t wearing green, it works very well and you don’t get the blurriness of the default Zoom background feature.

    In order to use Zoom with a green screen, open a meeting, and click on the little up arrow next to “Stop Video”. Click on “Choose virtual background”, and then check the little box at the bottom that says “I have a green screen.” Finally, upload your preferred background photo or video with the little + button in the upper right of the Choose Virtual Background window, and click on it to set it as your background.
  2. Math in background. The math you see in my background is a pre-recorded video that I upload into Zoom and set as my Zoom background as above. I recorded it using Notability on the iPad (see above), using the following steps:
    • First I write out partial notes in Notability that don’t have all the computations done, alternating whether the notes are in a column on the left half or the right half of each page, to keep space for my head on the Zoom screen.
    • I then make a screen recording on the iPad by swiping down from the upper right corner to get the iPad recording menu, and pressing the Record circle. It gives three seconds to tap back to Notability before it starts recording.
    • I speak out loud as if I’m giving the lecture to time it, and fill in the examples and computations on Notability with my Apple Pencil as I slowly scroll through the notes. The iPad only records the visual screen, not my sound, but speaking helps me make sure it’s timed well for my recording with my face and audio.
    • I finally swipe down from the upper right again to press stop on the recording, and it generates a video in the Photos app on the iPad. The video needs to be compressed to a smaller size to use as a Zoom background, so I got a free compression app called Compressor and I feed the screen recording to the Compressor app. Finally, I save the output to Dropbox so that I can access it from my computer.
  3. Zoom recording. With the background video ready, it’s time to open Zoom to record the final video. I open Zoom to a new meeting with just myself, and upload the video to the Choose Virtual Background menu as described in Step 1. Then I select my default CSU background to start, press “record” on Zoom, and after saying the intro piece, I click on the video to set it as my virtual background during the recording, and start talking about and pointing to the math as it shows up behind my head.
  4. Uploading. When you close the Zoom meeting, the folder with the recording pops up. At that point I rename the video and upload it to OneDrive.

So yes, it’s a lot of steps, but for only 10-15 minute videos, it wasn’t too bad to do the recording twice to make it look really nice and professional. Here are a few other tricks I discovered that improved the setup over time:

  • Proper height stand for laptop or webcam. I noticed I looked more natural and teacher-like if I was standing while recording rather than sitting. So I put one of the Husky storage boxes that I mentioned above (see outdoor teaching) on top of my desk and sat my laptop on top of a little box on top of that. That put the webcam at the right height, with my green screen behind me, so that my head was perfectly framed next to the notes as I stood and gestured to the mathematics.
  • Lapel mic. As I mentioned above, my house was not exactly quiet at all times with twin 2-year-olds running around. So I plugged in the lapel mic that I had gotten for outdoor recording, and went to audio settings in Zoom to choose that as my microphone. This way it only picked up my voice and not the noise around me.
  • Lighting. There is one window in my office and it’s good for illuminating the left hand side of my face during the day. I needed a light on my right to illuminate the rest, and cancel the shadow on my green screen (which can make the background less crisp). So I set up an LED lamp like this one on the right hand side of my laptop. Not the most professional lighting setup, but it worked.
  • Makeup. Bright lights and a white background can really make for a washed-out appearance. I rarely wear makeup, but I noticed that some red lipstick and blush and eyeshadow really made a difference in making me look human even with the lighting.
  • Practice. The one difficult thing about watching yourself on the screen to try to gesture to the math is that the video is flipped, opposite of what you would see in a mirror. So if you move your hand to the right, it goes left on the screen. There’s no way around this if you want to see the mathematics in the correct orientation as you’re recording, so you just have to practice the weatherman gesture technique. It was tough for the first 2 or 3 videos, but I got used to it.

    The other thing that required practice was speaking in time with the background recording as it did math and scrolled on its own behind me. Like the gesturing, you get used to it. You learn little phrases you can say to delay if it’s not scrolling as fast as you thought it would, and you learn how to wrap up a sentence quickly and segue into the next topic if it scrolls before you were expecting.

It was a lot of work, but I had fun doing it. Here’s a picture of the home green screen setup I described above:

If you have tips of your own on remote, outdoor, hybrid, or asynchronous teaching, please share them in the comments below!

Doing mathematics in a pandemic – Part II: Collaboration

This is the second post in a four-part series on adapting to the pandemic as a mathematician. See Part I – AlCoVE, Part III – Teaching, and Part IV – Talks with OBS

The first aspect of academia to be affected by the pandemic was conferences; the second was in-person collaborative projects. That research collaborator you invited to speak in your seminar can no longer visit, and the potential for a two-day intense collaboration to kick off a new project diminishes drastically. You can no longer meet in person with your graduate students, at least not as easily. Little things like deciding when you’re going to hold the fall Putnam club meetings suddenly turn from a quick conversation in the math department hallway into a five-email exchange.

So I, like all other mathematicians, found ways to adapt. I’ll share a few things that really worked, a few things that really didn’t, and a few extra tools that made things nicer. If you have tips of your own, please share them in the comments below!

Things that really worked

  • iPad with an Apple Pencil. Tablets have turned out to be an essential tool for remote research collaboration. The Apple Pencil stylus mimics writing on paper very well, and it’s great for writing shared scratchwork real-time, like you would when working alongside someone on a whiteboard or at a desk. I immediately purchased an iPad at the start of the pandemic (thanks CSU!), and I opted for the large-screen 12.9 inch size so that I had plenty of space to write mathematics and share it virtually.
  • Zoom. This almost goes without saying at this point, but it’s the best videoconferencing software I’ve tried so far. Its video and audio quality and the lack of lag really are impressive, and important for the natural flow of conversation.
  • The Zoom whiteboard. If you click “Share Screen” in a Zoom meeting, the first option is to share a whiteboard that other collaborators can write on as well. It’s a little finicky, but Zoom has been improving it, and here are some tips:
    • Saving. Always remember to click “Save” on the whiteboard annotation menu before ending the Zoom meeting. I wish Zoom did this automatically, but it unfortunately does not. It saves them as .png files, one for each page, in your Zoom folder, which should pop up after you leave the meeting.
    • Names popping up. If you see the annoying feature of someone’s Zoom name popping up by where they’re writing, click on the three dots dropdown on the annotation menu and click “Hide names of annotators”.
    • New page. You can click in the lower right corner of the whiteboard screen to make a new page.
    • Unsharing and resharing. Zoom used to have the extremely annoying feature of forgetting what was on the whiteboards when you stop sharing, so that if you share something else and then go back to sharing the whiteboard, it’s blank again. They have recently fixed this, and now it remembers the whiteboard for the duration of a Zoom meeting even after it is unshared and shared again!
  • Overleaf. Overleaf is a great tool for writing collaboratively in LaTeX, complete with an online editor with a preview window that is well-synced with the LaTeX code.
    Its file sharing system uses git for version control, so mathematicians who prefer working locally to working on the cloud can clone the git repository to a local folder. I recommend using the Overleaf project to share files, as opposed to say Dropbox (see below), so that everything is in one place.

Things that didn’t really work

  • A physical whiteboard to point your webcam at. This was the first thing I tried in order to collaborate over Zoom, before obtaining an iPad. It’s hard to set up in a way that it’s easily visible with no glare, and you end up getting back pain from hunching over so your face is in the screen sometimes as well as the whiteboard. I believe it can be done correctly if you have the right webcam and office setup though.
  • Writing something down on a piece of paper and holding it shakily up to your webcam. I admit, I’ve done it. We’ve all done it. But no. Just no.
  • Email. An email is great for setting up a Zoom meeting. Not so great for doing collaborative mathematics. It’s slow and cumbersome and a Zoom meeting is almost always better. The exception was when an email served as a way to share a quick idea before you forgot it, so that you can bring it up again at the next meeting.
  • Dropbox. In my experience, any time a Dropbox folder is set up and shared, it’s later forgotten about and then everyone has to ask each other what the Dropbox folder was called. Someone would make an extra Dropbox folder containing a single file consisting of a picture of a diagram they drew, and then after viewing it everyone forgets where the picture went. It’s also not very good for simultaneous editing of papers, in terms of version control. (See Overleaf above.)
  • Google Drive. See Dropbox.
  • Any video chat client that is not Zoom. I have heard some people saying they like Microsoft Teams, but I think it’s safe to say that avoiding Hangouts or Facetime or Facebook video chat is a good idea. The lag and connection issues alone make these alternatives very inconvenient, and they don’t have sharing or whiteboard capabilities.

Little things that are worth it

  • Paper feel screen protector for iPad. I didn’t even know this existed until the holiday season, when my husband surprised me with this. It is an iPad screen protector that makes writing with the Apple Pencil feel actually like writing on paper. After installing it, I’ve found that writing on the tablet actually is preferable to me to writing on paper, and this is coming from someone who loves paper and doubted tablets would ever truly replace them. It’s a little thing, but it made a huge difference to me.
  • Zoom chat or Google Hangouts (outside of meetings). A chat client to send quick ideas and messages, start impromptu Zoom meetings, and just say hi once in a while, is in my experience very useful, and can help avoid some of the email overload of the pandemic era.
  • Zulip, Slack, and Discord. For larger groups, an organized chat client like Zulip, Slack, or Discord can be very helpful. Threads can be sorted by topic and it is easier to follow what is going on. Zulip is my personal favorite, but I’ve had good experiences with all three.
  • Google docs/sheets. Sometimes you just need something a little simpler than Overleaf to manage tasks or jot down ideas. Google docs has pulled through for me in such situations.
  • Being kind, being silly, and having fun. There’s a real lack of human connection these days, and it’s always good to check in with collaborators to see how they’re doing, put up a funny Zoom background, or watch someone’s cat walk across their keyboard. Little things like this, for me, help to keep my job fun and worthwhile.

Doing mathematics in a pandemic – Part I: AlCoVE

I’ll be writing up a series of posts on what I’ve learned so far about adapting my work to a pandemic-compatible lifestyle. This is the first, and focuses on math conferences. Stay safe out there!
For the other posts in this series, see Part II – Collaboration, Part III – Teaching, and Part IV – Talks with OBS.

It was March 15, 2020, and suddenly everything stopped.

This story likely sounds familiar, because the same thing probably happened to you. Classes went online. Conferences were cancelled. No more chatting with colleagues at department tea. Home life suddenly became radically different and also much more central.

The world had grinded to a halt, and yet… there was one thing that began. And that was an overwhelming sense of community and solidarity, because everyone else in the world had stopped too. And it seemed to me to be an excellent opportunity to try to create something new together.

Mathematics and community

It is said that the most important aspect of conferences is not the talks, but the coffee breaks between them.

It sounds at first like a joke about how dependent mathematicians are on caffeine. But there is a real truth to it in a different sense. The coffee breaks are where connections are made, where new ideas are spawned, where the speaker meets the one person who just might have the right tools to crack that open problem that they posed on their last slide. They’re where pairs of mathematicians who find themselves in a deep conversation comparing each of their latest tableaux insertion algorithms awkwardly check their watches and schedules and both sheepishly admit that they weren’t really looking forward to the conference banquet anyway. They then grin and scurry off to an unoccupied whiteboard to make a new joint discovery.

When everything stopped, that stopped too. But did it have to, entirely?

This was a question I posed on the Facebook group for mathematicians who specialize in symmetric functions and related algebraic combinatorics (yes, there is a Facebook group for that!). I asked if anyone would want to help me organize an online conference that tried to re-create as many of those in-person networking aspects as possible. Something that could even potentially continue into the future, as flying to so many conferences all the time, while good for mathematical progress, is not really environmentally sustainable.

I got three enthusiastic responses within an hour. Laura Colmenarejo, Oliver Pechenik, and Liam Solus were on board, and we had an organizing committee!

AlCoVE: an Algebraic Combinatorics Virtual Expedition

In order to capture the essential aspects of the conference, namely that it is about algebraic combinatorics and that it aims to capture as many of the in-person advantages of conferences as possible, we named it the Algebraic Combinatorics Virtual Expedition, or AlCoVE. It didn’t hurt that alcove walks are a highly useful and modern combinatorial construction that arise in the study of Coxeter groups, symmetric functions, and geometry (see these slides by Elizabeth Millićević for excellent illustrations of alcove walks). We had a name, and we had a pun. It was a good start.

Then came the design phase. Laura, Liam, Oliver, and I met on Zoom weekly to start planning, and started by trying to answer some of the basics:

  1. What days should the conference be held?
    We initially thought of holding a weekend conference, but then we considered that with home life being more central during the pandemic, perhaps we should have it during “work hours” so as not to overlap with participants’ family/life plans. So we decided on Monday and Tuesday, June 15-16, on a week in which participants at universities with either a semester or quarter schedule would be unlikely to be teaching.

    I think it was the right choice in the end; in our post-conference feedback form, only 8 of the 71 respondents said they would have preferred a weekend conference. Another 24 were neutral, and the remaining 39 said they preferred the weekdays over weekend.

  2. How do we account for differing time zones, given that participants are going to be in many different locations around the world?
    Our solution to this was perhaps a bit biased towards the West, as our organizers were all in either America or Europe. But we planned the conference to be from 11 AM to 5 PM Eastern time, so that on the west coast of the USA it would be from 8 AM to 2 PM, and in Europe it would be an evening conference, for instance from 4 PM to 10 PM in London.

    That being said, we had participants from India, South Africa, Australia, China, New Zealand, and more. The time zone barrier just didn’t matter as much as we thought it would. And according to the feedback form, most participants were happy with the time and scheduling of the conference.

  3. How many speakers should we have and how long should each talk be?
    Zoom fatigue is real, and it’s just harder to focus when staring at a screen than sitting in a lecture hall. In light of this fact, we decided to have talks be on the short side, a total of 30 minutes each including questions. This gave us space in the schedule for 12 talks (6 per day) with plenty of breaks and exciting social events in between.

    We then came up with a list of potential speakers to invite. We were lucky to have a team four organizers with a diverse set of interests and geographical networks within algebraic combinatorics, and we tried to come up with a good balance of mathematical and geographical diversity among the speakers. While we didn’t initially consider gender diversity while creating our list, we were pleased to see that 6 out of 12 of the mathematicians we naturally thought of first were female. It was perhaps a reflection of the friendliness and diversity that already exists in the algebraic combinatorics community.

    To our delight, everyone that we invited to speak accepted our invitation. There are perhaps some advantages of organizing a conference at a time when literally everything else is cancelled.

  4. Should we have a poster session?
    This took us a long time to decide on and subsequently plan; indeed, a virtual talk is one thing, but how do you run a virtual poster session? Then again, poster sessions are a great way to give younger participants, especially graduate students, the opportunity to share their work and ideas.

    We did end up organizing a poster session, and limited the number of posters to 12 so that it would be more manageable in a virtual setting. We had a ton of excellent submissions that were very hard to choose between. The way we implemented it was by assigning one breakout room for each poster in Zoom, and then give every single participant “co-host” power in the meeting so that they can freely move between breakout rooms as if they are walking from one poster to another. (Non-co-hosts do not have this power in Zoom.)

    It went well overall. See “Conference Day 2” below for details on how the poster session went, and ideas on how to make a poster session potentially run even more smoothly at future conferences.

  5. What should “coffee breaks” or “lunch breaks” consist of, in order to optimize social and mathematical connection in a virtual environment?
    I’m glad you asked! This was by far the most fun part of planning the conference, and there were many bouts of doubled-over, tears-streaming-down-face laughter among the organizing committee during our Zoom meetings as we brainstormed potential fun ideas for conference activities. Here was what we came up with, and the surprises involved in planning each:
    • Polls. Fun, meaningless pseudo-mathematical polls, with multiple-choice questions like “What is the worst Coxeter group?” and “Do you consider yourself a combinatorist, a combinatorialist, or a combinatoricist?” were our first idea for a social event during the breaks.

      We were inspired by Zoom’s “poll” feature, but we quickly realized that using Zoom’s built-in poll system was not ideal. We wanted to split participants into breakout rooms to take the poll, so that smaller discussions of the questions could take place. But Zoom’s polls do not show up when participants are in breakout rooms. So that eliminated Zoom’s feature as an option pretty quickly.

      Instead, we used Google Forms to put together the polls. Here is one example:
      AlCoVE Poll 1
      We simply shared the link in the Zoom chat, then split participants up into breakout rooms randomly and gave them time to participate. We then called everyone back at the end to discuss the poll results, and it served as a predictably hilarious and relaxing break between talks.

    • Escape Rooms. We created one short “virtual escape room”, again in Google Forms, which has a regular expression matching feature to check answers, so that you could prevent participants from going to the next “room” (page) until they have solved the riddle in the previous “room”.

      (Tip: To enable this feature on a given question when creating a Google form, simply click on the three dots in the lower right of a question frame and click “Response Validation”. There are then options to make the answer have to match a regular expression of your choice, and return an error message if it is incorrect.)

      Here was our conference escape room. Clearly none of us were professional puzzle writers, but when Team 2 escaped their breakout room and came back into the main Zoom room before any other team, they punched the air and cheered in victory, and we knew the social event had achieved its purpose.

    • Scavenger hunt. We created a scavenger hunt, again in Google forms, that asked participants to find things in their home, such as math textbooks or conference T-shirts, to try to match or differ from their teammates in their breakout room to score the most possible points. We got some excellent pictures submitted to the scavenger hunt challenges, and it made for great “conference photos”.

    • Virtual Excursions. These were intended as true breaks from participants’ home office desks, in which participants were split into breakout rooms and encouraged to walk around their house with their phone or laptop on Zoom to show their breakout room their local surroundings and just generally stretch their legs. The aim was to re-create the aspect of conferences in which participants walk from the conference building to the banquet hall and end up walking with a little group and chatting on their way.

      It didn’t quite end up truly re-creating what we were hoping for, but it was an easy excursion to organize and was one step up from just putting participants in breakout rooms with no direction as to what would happen in that break, which can lead to awkward silences and a lot of turned-off videos.

    • Make-Your-Own-Lunch breaks. There was a half hour “lunch break” in the middle of each conference day, in which participants were again split into breakout rooms and encouraged to make and eat lunch together over Zoom. It did lead to more interaction – who doesn’t like to talk about food? – and was the only official meal we scheduled for participants to have together.

    • Happy Hour. At the end of the first day of the conference, we made everyone co-hosts (see Conference Day 1 or 2 below for some details on this process) and set up 11 breakout rooms. You can name breakout rooms manually in Zoom, and we called one room the “Lobby” and put everyone in the lobby to start out. The other rooms were called “Table 1” through “Table 10”. Since participants had co-host powers, they were able to go “sit down” at any table they chose.

      This worked very well to mimic an actual happy hour in which there are a number of tables in a large conference room and participants mingle by moving from table to table to see old friends and meet new acquaintances. The only things we couldn’t provide virtually were drinks and appetizers!

Conference Day 1: Success or disaster?

With all the talks, poster sessions, and social events planned out, it was finally time for the conference! Laura, Liam, Oliver, and I had several last-minute meetings to test everything and everything seemed to be in order.

Naturally, a major issue arose within the first half hour of the conference. As participants were signing in, we quickly realized it was capping the number of participants at 100, even though I had already bought the Zoom ability for my account to host 500 participants. Meanwhile, over 400 people had registered. The first talk was 5 minutes away, and I had no idea why Zoom was capping us at 100. What were we going to do?

We quickly sent emails directing everyone to switch to a different Zoom meeting number on Oliver’s university account, which had a 300 person capacity, and crossed our fingers and hoped that the number of participants did not exceed 300 at any given time that morning.

Luckily we capped out at about 290 during the first talk. Disaster averted!

In the meantime, I poked around Zoom and found the switch I needed to flip. Apparently even if your personal Zoom account is listed as being able to host large meetings, you are considered a “user” on your own Zoom account and you have to enable that “user” (yourself) to be able to use that power that the entire account bought. It makes no sense, but there it is.

I flipped the switch on zoom.us and we switched back to the original planned Zoom link after the lunch break. It was still glitchy; on both Oliver’s and my accounts, we had delays in the Zoom chat when people tried to post links and other information. It seemed that 200+ participants was simply getting a bit too large for Zoom to handle in one meeting, and their “large meeting” option was not entirely without issues yet.

The last thing that was awkward on the first day was the preparation for the happy hour. There is no way to assign users as co-hosts of the meeting before they log on, which means we had to manually make users co-hosts in advance of the happy hour. But there is also no button that makes everyone co-hosts at once on Zoom, so the only option is to manually make every participant a co-host one by one.

To make matters even more awkward, every time a participant is made a co-host, a little notification shows up on everyone’s screen. So the aim is to make them slowly enough that you don’t overwhelm the talk slides with notifications, but fast enough that everyone is a co-host by the time the happy hour starts so that they can all sit down at the “tables” (breakout rooms) of their choice. It was a tricky business but we got it done.

Aside from the technical issues, the first day went well. There were fantastic talks and funny polls and virtual excursions and a happy hour to cap it off at the end of the day.

Conference Day 2: Success!

While we didn’t have beginner’s luck, we did learn from Day 1, because Day 2 went much more smoothly. The conference didn’t cap our participants at 100. There were fewer participants overall and therefore fewer glitches in the chat window. The talks were incredible again, and the social activities went smoothly.

The main new challenge was the poster session. This was far harder to prepare for than the happy hour, because not only did I have to make everyone co-hosts during the talk preceding the poster session, but I had to create breakout rooms according to the posters. I created one “Lobby” room and then one room per poster, and tried to put the speaker and name of the poster as the name of the breakout room.

What I didn’t realize was that Zoom has a character limit on the breakout room names. What that meant was that I couldn’t just copy and paste the names of the presenters and titles of the posters from our website into Zoom. I had to first abbreviate and edit the titles so that they were under Zoom’s character limit, in a way that the content of the poster would still be clear to a participant browsing the titles from within Zoom. And naturally if I was editing a title in Zoom but tabbed over to glance at the title again before hitting “save”, it would delete my work and I’d have to start over. It was an unbelievable pain and I’d definitely prepare the abbreviations in advance next time.

Luckily I just barely finished the naming and assigning and co-hosting by the time the poster session was about to begin. And it began! I mostly stayed in the main room and directed lost souls who lost internet connection for a bit, but my co-organizers said that the poster session went very smoothly overall.

Video recordings and wrap-up

We recorded all the talks, and after the conference we used iMovie to do some basic processing (such as a title slide for each), and uploaded them to the new AlCoVE YouTube channel. We hope to add to this channel in future years!

Indeed, what was magical about AlCoVE is how it brought together so many mathematicians from all around the world so easily, and still re-created some of the social and networking advantages of in-person conferences. Moving more conferences online can not only drastically reduce the carbon footprint of academia, but even help with inclusivity and diversity in the community, as even those who ordinarily would not be able to travel were able to participate.

All in all, I believe AlCoVE was a very positive thing to come out of the worldwide shutdowns. I’m grateful to everyone who helped organize or speak or participate, and I hope (and will try to ensure) that it continues to run in future years, pandemic or no pandemic.

Addendum: FPSAC 2020

A few weeks after AlCoVE, I participated in FPSAC 2020 Online, the online pandemic version of an existing annual international conference called Formal Power Series in Algebraic Combinatorics (FPSAC). It was designed quite differently and also worked very well, and I learned about alternatives to Zoom breakout rooms like gather.town and Unhangout that could potentially be better for a happy hour or poster session than Zoom was. I’m excited to see where all of these recent virtual technologies lead the mathematical community in the long run.

A linear algebra-free proof of the Matrix-Tree Theorem

As a new assistant professor at Colorado State University, I had the privilege this fall of teaching Math 501, the introductory graduate level course in combinatorics. We encountered many ‘mathematical gemstones’ in the course, and one of my favorites is the Matrix-Tree theorem, which gives a determinantal formula for the number of spanning trees in a graph. In particular, there is a version for directed graphs that can be stated as follows.

Consider a directed graph $D=(V,E)$, consisting of a finite vertex set $V=\{v_1,\ldots,v_n\}$ and a set of directed edges $E\subseteq V\times V$. An oriented spanning tree of $D$ is a subset $T\subset E$ of the edges, along with a chosen root vertex $v_k$, such that there is a unique path in $T$ from any vertex $v_j\in V$ to the root $v_k$. Such a tree is said to be oriented towards $v_k$, since all the edges are `pointing towards’ the root. The term spanning indicates that $T$ is incident to every vertex in $V$. For example, in the digraph $D$ at left below, an oriented spanning tree rooted at $v_9$ is shown using red edges in the graph at right.

Define $\tau(D,v_k)$ to be the number of oriented spanning trees of $D$ rooted at $v_k$. One can check that, in the above graph, we have $\tau(D,v_9)=16$. Now, let $m_{i,j}$ be the number of directed edges from $v_i$ to $v_j$ in $D$, so that $m_{i,j}$ is equal to $1$ if $(v_i,v_j)$ is an edge and $0$ otherwise. Define the Laplacian of the digraph $D$ to be the matrix $$L(D)=\left(\begin{array}{ccccc} \mathrm{out}(v_1) & -m_{1,2} & -m_{1,3} & \cdots & -m_{1,n} \\ -m_{2,1} & \mathrm{out}(v_2) & -m_{2,3} & \cdots & -m_{2,n} \\ -m_{3,1} & -m_{3,2} & \mathrm{out}(v_3) & \cdots & -m_{3,n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -m_{n,1} & m_{n,2} & -m_{n,3} & \cdots & \mathrm{out}(v_n) \end{array}\right)$$ where $\mathrm{out}(v_i)$ is the outdegree of $v_i$, the number of non-loop edges having starting vertex $v_i$ (that is, the number of edges from $v_i$ to a vertex other than $v_i$). Then the (directed) Matrix-Tree theorem states that $$\tau(D,v_k)=\det(L_0(D,k))$$ where $L_0(D,k)$ is the deleted Laplacian obtained by deleting the $k$th row and column from $L(D)$. For instance, in the above graph, we have $$\det L_0(D,9)=\det \left(\begin{array}{cccccccc} 2 & -2 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & -1 & -1 & 0 & 0 & 0 & 0 & 0 \\ -1 & 0 & 2 & 0 & 0 & -1 & 0 & 0 \\ 0 & -1 & 0 & 2 & -1 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 & 2 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 & -1 & 2 & 0 & -1 \\ 0 & 0 & 0 & 0 & -1 & 0 & 2 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{array}\right)=16$$ There are several known proofs of the Matrix-Tree theorem. One of the more `standard’ proofs is by induction on the number of edges in the digraph, combined with a bit of linear algebra and row reduction. But it got me thinking:

Is there be a way to prove that the determinant formula holds directly, without relying on induction or linear algebra?

In particular, the determinant of a matrix $A=(a_{ij})$ can be defined explicitly as $$\det(A)=\sum_{\pi\in S_n} \mathrm{sgn}(\pi)\prod_{i} a_{i\pi(i)}$$ where $\pi:\{1,2,\ldots,n\}\to \{1,2,\ldots,n\}$ ranges over all permutations (bijections) in the symmetric group $S_n$. For instance, \begin{align*} \det\left(\begin{array}{ccc} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{array}\right)&=a_{11}a_{22}a_{33}-a_{12}a_{21}a_{33}-a_{11}a_{23}a_{32} \\ &\phantom{=}+a_{12}a_{23}a_{31}+a_{13}a_{21}a_{32}-a_{13}a_{22}a_{31}. \end{align*} It is natural to ask whether applying this formula to the deleted Laplacian gives any combinatorial insight into why the Matrix-Tree theorem should hold. And indeed, there is a direct proof using this combinatorial definition of the determinant!

A combinatorial proof

For simplicity we set $k=n$, so that we are deleting the $n$th row and column to create the deleted Laplacian $\det(L_0(D,n))$. It is sufficient to consider this case since we can always relabel the vertices to have the deleted vertex be the $n$th.

We now give a combinatorial interpretation of each of the terms of the determinant $\det(L_0(D,n)$ as a sum over permutations of $\{1,2,\ldots,n-1\}$. The term corresponding to the identity permutation is the product of the diagonal entries of $L_0(D,n)$, which is $$\prod_{i\neq n} \mathrm{out}(v_i).$$ This counts the number of ways of choosing a non-loop edge starting at each vertex $v_i\neq v_n$; we call such a choice an out-edge subgraph $G$ of $D$. Note that all oriented spanning trees with root $v_n$ are out-edge subgraphs, but in general an out-edge subgraph may have cycles among the vertices other than $v_n$. In fact, it is not hard to see that every out-edge subgraph consists a number of nontrivial directed cycles among non-$v_n$ vertices, along with a unique directed path from every other vertex into either one of the cycles or into $v_n$. Two examples of out-edge subgraphs which are not trees are shown below.

Now, for a general term corresponding to a permutation $\pi$ of $\{1,2,\ldots,n-1\}$, consider the decomposition of $\pi$ into disjoint cycles. Suppose there are $p$ fixed points and $r$ nontrivial cycles; let $a_1,\ldots,a_p$ be the fixed points of $\pi$ and $(a_{1}^{(j)}\cdots a_{c_j}^{(j)})$ are the other cycles of lengths $c_1,\ldots,c_r$. Then the sign of $\pi$ is $$\mathrm{sgn}(\pi)=(-1)^{(c_1-1)+\cdots+(c_r-1)}=(-1)^{(n-1-p)-r}.$$ The entries multiplied together in the term corresponding to $\pi$ are the outdegrees of $v_{a_1},\ldots, v_{a_p}$ along with the values $-m_{a_{t}^{(i)},a_{t+1}^{(i)}}$. Their product is $(-1)^{n-1-p}$ times the number of ways to choose an edge from $v_{a_t^{(i)}}$ to $v_{a_{t+1}^{(i)}}$ for each $i$ and $t$. Putting this all together, the entire term of the determinant corresponding to $\pi$ is $(-1)^{r}$ times the number of subgraphs formed by choosing a cyclic path on the vertices corresponding to each nontrivial cycle in $\pi$, as well as an out edge for each fixed point. Such a choice is an out-edge subgraph that is compatible with $\pi$ in the sense that any cycle of $\pi$ corresponds to a cycle on the subgraph.

For some examples of compatibility, the permutations $(123)$, $(123)(57)$, $(57)$, and the identity are compatible with the out-edge subgraph drawn above at left. The permutations $(365)$ and the identity are compatible with the subgraph above at right.

It follows that we can rewrite the determinant as: $$\det L_0(D,n)=\sum_{(G,\pi)} (-1)^{r(\pi)}$$ where $r(\pi)$ is the number of nontrivial cycles of the $\pi$, and where the sum ranges over all pairs $(G,\pi)$ where $G$ is an out-edge subgraph and $\pi$ is a permutation compatible with $G$. (Note that the same out-edge subgraph $G$ may occur several times, paired with different permutations $\pi$.)

We finally construct a sign-reversing involution on the compatible pairs $(G,\pi)$ that cancel all the negative entries in the sum above. In particular, if $G$ has no cycles then send $(G,\pi)$ to itself, and otherwise consider the cycle $C$ in $G$ containing the vertex with the smallest label among all cycles in $G$. Define $\pi’$ by removing $C$ from $\pi$ if $\pi$ contains the cycle $C$, and otherwise adding $C$ to $\pi$ (in other words, toggle whether the elements of $C$ form a cycle or are all fixed points in the permutation). Then $\pi’$ is still compatible with $G$, so we can map $(G,\pi)$ to $(G,\pi’)$ in this case. This forms a sign-reversing involution in which the only non-canceling terms come from the pairs $$(T,\mathrm{id})$$ where $T$ is an out-edge subgraph with no cycles and $\mathrm{id}$ is the identity permutation.

Since a non-cyclic out-edge subgraph on $v_1,\ldots,v_{n-1}$ must be rooted at $v_n$ (for otherwise it would have a cycle), we can conclude that $\det L_0(D,n)$ is the number of spanning trees of $D$ rooted at $v_n$.

On Raising Your Hand

A few weeks ago I attended the AWM (Association of Women in Mathematics) Research Symposium in Houston, TX. I gave a talk in my special session, speaking on queer supercrystals for the first time, to a room full of female mathematicians.

I was a bit disappointed when, at the end of my talk, no one raised their hand to ask any questions.  It’s usually the classic sign of an uninteresting or inappropriately aimed talk, so I figured that maybe I had to revisit my slides and make them more accessible for the next time I spoke on the subject.

Afterwards, however, several of the women in my session came up to me privately to ask specific questions about my research.  When I told my husband about this after the conference, he pointed out that perhaps they just were the kind of people to prefer asking questions one-on-one rather than raising their hands during or after the lecture.

“Did anyone in your session ask questions after the other talks?” he asked me, testing his theory.

I thought about it, and was surprised when I realized the answer.  “Woah, I think you’re right,” I said.  “I asked at least one question after nearly every talk.  But I think I was the only one.  Once in a while one other woman would ask something too.  But the rest kept their hands down and went up to the speaker during the break to ask their questions.”

Upon further reflection, I realized that this was even true during the plenary talks.  During an absolutely fantastic lecture by Chelsea Walton, I was intrigued by something she said.  She mentioned that the automorphism group of the noncommutative ring $$\mathbb{C}\langle x,y\rangle/(xy-qyx)$$ is $\mathbb{C}^{\times} \times \mathbb{C}^{\times}$ for all $q\neq \pm 1$, but the answer is different at $q=1$ and $q=-1$.  I knew that many of the standard $q$-analogs arise naturally in computations in this particular ring, such as the $q$-numbers $$[n]_q=1+q+q^2+\cdots +q^{n-1}.$$ So, I wondered if the exceptions at $q=1$ and $q=-1$ were happening because $q$ was a root of unity, making some of the $q$-numbers be zero.  So maybe she was considering $q$ as a real parameter?  I raised my hand to ask.

“Is $q$ real or complex in this setting?”

“It’s complex,” Chelsea answered.  “Any nonzero complex parameter $q$.”

“Really?” I asked. “And there are no exceptions at other roots of unity?”

“Nope!” she replied with a smile, getting excited now.  “Just at $1$ and $-1$.  The roots of unity get in your way when looking at the representation theory.  But for the automorphism group, there are only two exceptional values for $q$.”  Fascinating!

No one else asked any mathematical questions during or after that talk.


Now, I have the utmost faith in womankind.  And I would normally have chalked the lack of questions and outspokenness up to it being a less mathematically cohesive conference than most, because the participants were selected from only a small percentage of mathematicians (those that happened to be female).  But it reminded me of another time, several years ago, that I had been surprised to discover the same phenomenon among a group of women in mathematics.

One summer I was visiting the Duluth REU, a fantastic research program for undergraduates run by Joe Gallian in the beautiful and remote city of Duluth, Minnesota.  As a former student at the program myself, I visited for a couple of weeks to hang out and talk math with the students.  I attended all the weekly student talks, and as usual, participated heavily, raising my hand to ask questions and give suggestions.

The day before I left, Joe took me aside.  “I wanted to thank you for visiting,” he said.  “Before you came, the women never raised their hand during the other students’ talks.  But after they saw you doing it, suddenly all of them are participating and raising their hands!”

I was floored.  I didn’t know that being a woman had anything to do with asking questions.

I have always felt a little out of place at AWM meetings.  They are inevitably host to many conversations about the struggles faced by women in competitive male-dominant settings, which I have never really related to on a personal level.  I love the hyper-competitive setting of academia.  I live for competition; I thrive in it.  And it never occurs to me to hold back from raising my hand, especially when I’m genuinely curious about why $q$ can be a complex root of unity without breaking the computation.

But, clearly, many women are in the habit of holding back, staying in the shadows, asking their questions in a one-on-one setting and not drawing attention to themselves.  And I wonder how much this phenomenon plays a role in the gender imbalance and bias in mathematics.


At the reception before the dinner at the AWM conference, I spotted Chelsea.  She was, unsurprisingly, quite popular, constantly engaged in conversation with several people at once.  I eventually made my way into a conversation in a group setting with her in it, and I introduced myself.

“Hi, I just wanted to say I really enjoyed your talk!  I was the one asking you whether $q$ was real.”

Her expression suddenly shifted from ‘oh-no-not-another-random-person-I-have-to-meet’ to a warm, smiling face of recognition.  “Oh!  I liked your question!” she exclaimed.  The conversation immediately turned to math, and she was nice enough to walk me through enough computations to convince me that $q=\pm 1$ were special cases in computing the automorphism group of the noncommutative ring.  (See Page 2 of this post for the full computation!)

The entire experience got me thinking.  It was because I raised my hand that Chelsea recognized me, that she was happy to talk to me and mathematics was communicated.  It was because I raised my hand that I got the question out in the open so that other participants could think about it as well.  It was because I raised my hand that women were doing mathematics together.  And perhaps it is because I raise my hand that I have no problem interacting in a male-dominant environment.  After all, they raise their hands all the time.

It is tempting to want to ask the men in mathematics to take a step back and let the women have the limelight once in a while.  But I don’t think that’s the answer in this case.  Men should keep raising their hands.  It’s part of how mathematics gets done.  It helps to communicate ideas more efficiently, to the whole room at once rather than only in private one-on-one settings.  It draws visibility to the interesting aspects of a talk that other participants may not have thought of.

What we really need is for women to come out of the shadows.  So, to my fellow women in mathematics: I’m calling on all of us to ask all our questions, to engage with the seminar room, to not hold back in those immensely valuable times when we are confused.  And raise our hands!

Addicted to Crystal Math

Some of my recent research has centered on crystals, so here’s an introductory post, from a combinatorial point of view, on what crystals are and where they come up.

First, a clarification: the crystals I have in mind are not the chemical kind that you find in nature, nor are they any of these other uses of the word crystal on the Wikipedia disambiguation page. There is yet another use of the word, sometimes referred to as a crystal base, introduced by Kashiwara in the mid-90’s. Kashiwara developed crystal base theory as a way of understanding the $q\to 0$ limit of the representation theory of quantum groups $U_q(\mathfrak{g})$ where $\mathfrak{g}$ is a Lie algebra and $U_q(\mathfrak{g})$ is a $q$-analog of the universal enveloping algebra $U(\mathfrak{g})$. In the quantized statistical mechanics models in which quantum groups come up, the parameter $q$ is a measure of temperature, so the $q\to 0$ limit is exploring what happens in the situation of absolute zero temperature. Hence the word “crystal”, referring to what happens to many forms of matter when frozen.

Although the original motivation for crystals came from quantum groups, the combinatorial structure is also very useful in symmetric function theory. One example of a crystal is a certain weighted, edge-labeled directed graph structure on the summands that comprise a Schur polynomial. Recall that the Schur function $s_{\lambda/\mu}(x_1,\ldots,x_n)$ is given by the sum $$s_{\lambda/\mu}=\sum_{T\in \mathrm{SSYT}(\lambda/\mu)} x^T$$ where $\mathrm{SSYT}(T)$ is the set of semistandard Young tableaux of skew shape $\lambda/\mu$, and $x^T=x_1^{m_1}x_2^{m_2}\cdots$ where $m_i$ is the number of $i$’s in $T$. For instance, we have $$s_{(2,1)}(x_1,x_2,x_3)=x_1^2x_2+x_1x_2^2+x_1^2x_3+2x_1x_2x_3+x_1x_3^2+x_2^2x_3+x_2x_3^2,$$ since the eight semistandard Young tableaux of straight shape $(2,1)$ are:

We can organize these tableaux by weight in a 2-dimensional grid, since each weight is a tuple $(m_1,m_2,m_3)$ where $m_1+m_2+m_3=3$. We angle the grid so that either changing the weight by $(-1,1,0)$ or by $(0,-1,1)$ is “lowering” the weight, going downwards at a 45 degree angle to the right or left. We will then connect some of the tableaux along the edges of the grid with edges labeled by weight-lowering operators $F_1$ and $F_2$ (with inverses $E_1$ and $E_2$), shown in red and blue below:

In general $F_i$ changes the weight by $(0,0,\ldots,-1,1,0,\ldots,0)$ with the $-1$ in the $i$-th position, and we think of $F_i$ as an operator on tableaux that changes an $i$ into an $i+1$. To define $F_i(T)$, we consider the reading word $w$ of the tableau $T$, formed by concatenating the rows from bottom to top, and consider just the $i$ and $i+1$ entries of $w$. Replace each $i$ with a closed bracket ‘)’, and each $i+1$ with an open bracket ‘(‘, and cancel the brackets in matching pairs until only a word of the form $$)))\cdots ))(((\cdots (($$ (or alternatively, of the form $ii\cdots ii(i+1)(i+1)\cdots(i+1)$) remains. Then $F_i$ changes the last $i$ in this subsequence to $i+1$. For example, if $T$ looks like:

then to apply $F_1$ to $T$, we consider the reading word of $1$s and $2$s and cancel matching pairs as follows:

\begin{array}{ccccccccccc}
2 & 2 & 1 & 1 & 1 & 1 & 2 & 1 & 2 & 2 & 1 \\
( & ( & ) & ) & ) & ) & ( & ) & ( & ( & ) \\
( & & & ) & ) & ) & & & ( & & \\
& & & & ) & ) & & & ( & &
\end{array}

Then $F_1$ changes the rightmost $)$ that was not cancelled to $($, changing that $1$ to $2$. The word becomes:

\begin{array}{ccccccccccc}
2 & 2 & 1 & 1 & 1 & 2 & 2 & 1 & 2 & 2 & 1.
\end{array}

Note that $F_i$ may not be defined, if there is no closed bracket that is not cancelled. If no $F_i$ operator may be applied to a tableau $T$, we say that $T$ is lowest weight. The bottom tableau on the crystal diagram above is lowest weight for the operators $F_1,F_2$. Similarly, we can define raising operators $E_i$ as the natural partial inverses of the $F_i$ operators (simply by reversing the arrows in the crystal graph), and we define highest weight elements as those for which all $E_i$ operators are undefined.

Now here’s the interesting thing: highest weight tableaux or words are precisely the Littlewood-Richardson tableaux that we have encountered in several previous posts. Recall that a tableau is Littlewood-Richardson if its reading word $w=w_1\cdots w_n$ is ballot, meaning that every tail of the word (of the form $w_k\cdots w_n$ for some $k$) has at least as many $1$’s as $2$’s, at least as many $2$’s as $3$’s, and so on. In other words, when reading the word from right to left, the smaller letters always occur more frequently than larger letters. For instance, the word $$12122111$$ is ballot, and it is not hard to see that ballotness in a word with just $1$’s and $2$’s is equivalent to the condition that every $2$ is bracketed with a $1$, so that $E_1$ is undefined.

The key fact about tableau crystals is as follows:

Every connected component of a crystal graph has a unique highest weight element.

For instance, here is the crystal graph for the skew shape $(3,1)/(1)$:

Note that each connected component has a Littlewood-Richardson tableau at the top, and no other elements are killed by all raising operators. Notice also that the first graph above has the same structure as that of the shape $(2,1)$ crystal drawn above. This is no coincidence: another key fact about the operators $F_i$ and $E_i$ is that they commute with Jeu de Taquin rectification, which is easily proven using Knuth equivalence. The tableaux in the shape $(2,1)$ crystal are simply the Jeu de Taquin rectifications of those in the corresponding component of the $(3,1)/(1)$ graph. Similarly, the tableaux in the second component rectify to form the tableau crystal for shape $(3)$. It follows that, if we sum the monomial weights in the graph, we obtain the equation $$s_{(3,1)/(1)}=s_{(2,1)}+s_{(3)},$$ which is an instance of the famous Littlewood-Richardson rule.

In general, let $c^{\lambda}_{\mu\nu}$ be the number of Littlewood-Richardson tableaux of shape $\lambda/\mu$ and content $\nu$. Then recall that the Littlewood-Richardson rule for multiplying Schur functions states:

$$s_\mu s_\nu = \sum_{\lambda} c^{\lambda}_{\mu\nu} s_\lambda,$$

or equivalently, after a bit of symmetric function theory manipulation,

$$s_{\lambda/\mu}=\sum_{\nu} c^{\lambda}_{\mu\nu} s_\nu.$$

The latter equation now follows immediately from the facts about crystals listed above, as we saw for the shape $(3,1)/(1)$. We can also prove the product version directly using crystals, by introducing a tensor product. There is a natural tensor product on crystals coming from their correspondence with Lie algebra representations, which we will define in general on the next page. In combinatorial terms, we can define the tensor product of crystals of tableaux by defining the concatenation of two tableaux $T$ and $S$ to be the tableau $T\cdot S$ formed by drawing $S$ above and to the right of $T$, with the lower left corner of $S$ matching the upper right corner of $T$. For instance, tableaux of skew shape $(3,1)/(1)$ can be formed by concatenating a tableau of shape $(1)$ with a tableau of shape $(2)$. (Note that this concatenates the reading words of $T$ and $S$ as well, and induces the plactic monoid structure on Knuth equivalence classes of words.)

Then the tensor product $\mathcal{B}_\mu \otimes \mathcal{B}_\nu$ of two tableaux crystals on shapes $\mu$ and $\nu$ is the crystal on all words of the form $T\cdot S$ for $T\in \mathcal{B}_\mu$ and $S\in \mathcal{B}_\lambda$, with operators $F_i$ connecting the nodes appropriately. It turns out that every connected component in the resulting crystal are again full crystals of tableaux, with Schur functions as their weight generating functions. We thus obtain a crystal-theoretic interpretation of the multiplicative Littlewood-Richardson rule. Namely, we have
$$\mathcal{B}_\mu \otimes \mathcal{B}_\nu=\bigcup c^\lambda_{\mu\nu} \mathcal{B}_{\lambda}.$$

For instance, $\mathcal{B}_{(1)}\otimes \mathcal{B}_{(2)}$ is the crystal whose graph is precisely the diagram of $\mathcal{B}_{(3,1)/(1)}$ shown above, so $$\mathcal{B}_{(1)}\otimes \mathcal{B}_{(2)}=\mathcal{B}_{(2,1)}\cup \mathcal{B}_{(3)},$$ matching the Schur expansion $s_{(1)}s_{(2)}=s_{(2,1)}+s_{(3)}.$

With these examples in mind, on the next page we will define the tensor category of crystals over any semisimple Lie algebra.

Hockey sticks on planet ABBABA

Question: On the planet ABBABA, the inhabitants have a binary language where the only two letters in their alphabet are A and B. The language is incredibly efficient and complex in that every finite sequence of A’s and B’s is a valid word. How many of the words in this language have exactly five A’s and at most five B’s?

For instance, ABAAABA, AAAAA, and BBBBBAAAAA are all valid such words, since they all have five A’s and no more than five B’s.

For the sake of readers who might want to grapple with this problem on their own, I’ll leave the answer and discussion to the next page. We’ll start with a straightforward counting method, and then go on to find a much faster method that will also lead us to an elegant combinatorial proof of a well-known binomial coefficient identity. Once you think you have the answer, turn to page 2!

The CW complex structure of the Grassmannian

In a previous post, we briefly described the complex Grassmannian $\mathrm{Gr}(n,k)$ as a CW complex whose cells are the Schubert cells with respect to a chosen flag. We’ll now take a closer look at the details of this construction, along the lines of the exposition in this master’s thesis of Tuomas Tajakka (chapter 3) or Hatcher’s book Vector Bundles and $K$-theory (page 31), but with the aid of concrete examples.

Background on CW complexes

Let’s first review the notion of a CW complex (or cell complex), as described in Hatcher’s Algebraic Topology.

An $n$-cell is any topological space homeomorphic to the open ball $B_n=\{v\in\mathbb{R}^n:|v|<1\}$ in $\mathbb{R}^n$. Similarly an $n$-disk is a copy of the closure $\overline{B_n}=\{v\in \mathbb{R}^n:|v|\le 1\}$.

To construct a cell complex, one starts with a set of points called the $0$-skeleton $X^0$, and then attaches $1$-disks $D$ via continuous boundary maps from the boundary $\partial D$ (which simply consists of two points) to $X^0$. The result is a $1$-skeleton $X^1$, which essentially looks like a graph:

This an then be extended to a $2$-skeleton by attaching $2$-disks $D$ via maps from the boundary $\partial D$ (which is a circle) to $X^1$. Note that, as in the picture below, the attaching map may collapse the entire boundary to a point, making the circle into a balloon-like shape.

In general the $n$-skeleton $X^n$ is formed from $X^{n-1}$ by attaching a set of $n$-disks by maps from their boundaries to $X^{n-1}$.

More precisely, to form $X^n$ from $X^{n-1}$, we start with a collection of $n$-disks $D^n_\alpha$ and continuous attaching maps $\varphi_\alpha:\partial D_\alpha^n\to X^{n-1}$. Then $$X^n=\frac{X^{n-1}\sqcup \bigsqcup_\alpha D_\alpha^n}{\sim}$$ where $\sim$ is the identification $x\sim \varphi_\alpha(x)$ for $x\in \partial D^n_\alpha$. The cell complex is $X=\bigcup_n X^n$, which may be simply $X=X^n$ if the process stops at stage $n$.

By the construction, the points $X^0$ along with the open $i$-cells associated to the $i$-disks in $X^i$ for each $i$ are disjoint and cover the cell complex $X$. The topology on $X$ is given by the rule that $A\subset X$ is open if and only if $A\cap X^n$ is open in $X^n$ for all $n$.

The real projective plane

As an example, consider the real projective plane $\mathbb{P}^2_{\mathbb{R}}$. It has a cell complex structure in which $X^0=\{(0:0:1)\}$ is a single point, and $X^1=X^0\sqcup \{(0:1:\ast)\}$ is topologically a circle formed by attaching a $1$-cell to the point at both ends.

Then, $X^2$ is formed by attaching a $2$-cell $\mathbb{R}^2$ to the circle such that the boundary wraps around the $1$-cell twice. Intuitively, this is because the points of the form $(1:xt:yt)$ as $t\to \infty$ and as $t\to -\infty$ both approach the same point in $X^1$, so the boundary map must be a $2$-to-$1$ mapping.

To make this analysis more rigorous, consider $\mathbb{P}^2_{\mathbb{R}}$ as the quotient $(\mathbb{R}^3\setminus\{(0,0,0)\})/\sim$ where $\sim$ is the equivalence relation given by scalar multiplication. In particular there is a natural continuous quotient map $$q:\mathbb{R}^3\setminus \{0\}\to \mathbb{P}^2_{\mathbb{R}}.$$ We can now take $n$-disks in $\mathbb{R}^3\setminus\{0\}$ and map them into $\mathbb{P}^2_{\mathbb{R}}$ via $q$, which will give the CW complex structure described above.

In particular, the point $(0,0,1)$ maps to $(0:0:1)$, giving the $0$-skeleton. Then consider the set of points $$\{(0,a,b):a^2+b^2=1, a\ge 0\}\subseteq \mathbb{R}^3\setminus \{0\},$$ which is a half-circle and thus homeomorphic to a $1$-disk. The quotient map $q$ gives an attaching map of the boundary of this $1$-disk to $(0:0:1)$, by mapping both endpoints $(0,0,1)$ and $(0,0,-1)$ to $(0:0:1)$. It also maps the interior of this disk bijectively to the $1$-cell $(0:1:\ast)$ in $\mathbb{P}^2_{\mathbb{R}}$.

Finally, consider the set $$D=\{(a,b,c):a^2+b^2+c^2=1,a\ge 0\},$$ which is a half-sphere and thus homeomorphic to a $2$-disk. Then under $q$, the interior of $D$ maps to the $2$-cell $(1:\ast,\ast)$, and $q$ gives a $2$-to-$1$ attaching map of the boundary $\partial D=\{(0,b,c): b^2+c^2=1\}$ onto the $1$-skeleton $\{(0:x:y)\}$. Indeed, given a desired ratio $(0:x:y)$, there are two points $(0,b,c)\in \partial D$ mapping to it, since a line with a given slope intersects the unit circle in exactly two points.

The complex projective plane

A nearly identical construction to the one above can help us realize the complex projective plane $\mathbb{P}_{\mathbb{C}}^2$ as a CW complex as well, though the structure is somewhat different.

In particular, we can again start with the point $(0:0:1)$ as our $0$-skeleton, and attach the $1$-cell $(0:1:\ast)$ as follows. Let $$q:\mathbb{C}^3\setminus\{0\}\to \mathbb{P}_{\mathbb{C}}^2$$ be the natural quotient map, and define $$D=\{(0,b,c):|(0,b,c)|=1, b\in \mathbb{R}_{\ge 0}\}\subseteq \mathbb{C}^3\setminus \{0\}.$$ Then we can write $c=x+yi$ and the conditions defining $D$ are equivalent to $b,x,y\in \mathbb{R}$, $b\ge 0$, $b^2+x^2+y^2=1$ (since $|c|^2=c\overline{c}=x^2+y^2$). Thus $D$ is a real half-sphere and hence a $2$-disk. The quotient map takes the interior of $D$, on which $b>0$, bijectively and continuously to $(0:1:\ast)$, while collapsing its entire circular boundary to the $0$-skeleton as the attaching map.

Finally, we can similarly take the image of $$E=\{(a,b,c): |(a,b,c)|=1, a\in \mathbb{R}_{\ge 0}\},$$ which is homeomorphic to a 4-disk, to obtain the remaining points in $\mathbb{P}_{\mathbb{C}}^2$. (Indeed, setting $b=x+yi$ and $c=z+wi$ for real numbers $x,y,z,w$, the space $E$ is cut out by the equations $a^2+x^2+y^2+z^2+w^2=1$, $a\ge 0$ in $5$-dimensional real space, so it is a $4$-hemisphere.) The attaching map on the boundary is “circle-to-one”: every point in the $1$-skeleton has a circle as its preimage in $E$. (Can you prove this?)

Note that these constructions can be continued for higher dimensional projective spaces over $\mathbb{C}$ or $\mathbb{R}$.

The Grassmannian

We now can analyze the complex Grassmannian $\mathrm{Gr}(n,k)$ – the space of $k$-dimensional subspaces of $\mathbb{C}^n$ – in a similar manner to that of a complex projective space. In fact, $\mathbb{P}_{\mathbb{C}}^n=\mathrm{Gr}(n+1,1)$, so we have already done this for some Grassmannians.

For $\mathrm{Gr}(n,k)$ with $k\ge 2$, we use a similar quotient map, but this time from the Stiefel manifold $V_k=V_k(\mathbb{C}^n)$, the space of orthonormal $k$-frames. That is, $V_k$ is the space of all tuples of linearly independent vectors $(v_1,\ldots,v_k)\in (\mathbb{C}^n)^k$ for which $$\langle v_i, v_j\rangle=\delta_{ij}$$ for all $i$ and $j$. (Here $\langle-,-\rangle$ can be taken as the standard Hermitian inner form $\langle (z_1,\ldots,z_n), (w_1,\ldots,w_n) \rangle=\sum z_i \overline{w_i}$.)

Now, there is a natural quotient map $q:V_k\to \mathrm{Gr}(n,k)$ given by row-reducing the matrix formed by the $k$-frame, as we did in this post. Note that $V_1=\mathbb{C}^n\setminus\{0\}$ is the space we used above for projective space, and that $q$ matches our map above in this case.

We will not go into full details of the CW complex construction here, since Tajakka’s paper does it so nicely already. Instead, let’s outline the construction for one of the nontrivial attaching maps in $\mathrm{Gr}(4,2)$.

Recall that in $\mathrm{Gr}(4,2)$, the Schubert varieties are indexed by partitions whose Young diagram fits inside a $2\times 2$ rectangle. The Schubert variety $\Omega_{(2,2)}$ (with respect to the standard flag) consists of the single point of the Grassmannian represented by the row-reduced matrix

$$\left[\begin{array}{cccc} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{array}\right]$$

Use this point as the $0$-skeleton of the Grassmannian. For the $2$-skeleton, we attach the $2$-cell given by $$\Omega_{(2,1)}^{\circ}=\left[\begin{array}{cccc} 0 & 0 & 0 & 1 \\ 0 & 1 & \ast & 0 \end{array}\right]$$ to the $0$-skeleton, which is the unique boundary point in its closure. So the $1$-skeleton is actually homeomorphic to the complex projective line.

We now construct the attaching maps that show that if we include the two $4$-dimensional (over $\mathbb{R}$) Schubert cells given by $\Omega_{(1,1)}^{\circ}$ and $\Omega_{(2)}^{\circ}$, we get a $4$-dimensional CW complex. Consider the cell $$\Omega_{(1,1)}=\left[\begin{array}{cccc} 0 & 0 & 1 & \ast \\ 0 & 1 & 0 & \ast \end{array}\right].$$ Define $E$ to be the following subspace of $V_2(\mathbb{C}^4)$:
$$D=\{(v_1,v_2)\in V_2: v_1=(0,0,a,b), v_2=(0,c,d,e)\text{ for some }b,d,e\in \mathbb{C}, a,c\in \mathbb{R}_{\ge 0}\}.$$ Then $D$ maps under $q$ to $\Omega_{(1,1)}=\overline{\Omega_{(1,1)}^\circ}$, its interior mapping to $\Omega_{(1,1)}^\circ$.

We now wish to show that $D$ is topologically a $4$-disk, making $q$ an attaching map. This is the tricky part in general, but we’ll describe the proof explicitly in this special case.

Let $$D_1=\{(0,0,a,b):a\ge 0, b\in \mathbb{C}, a^2+b\overline{b}=1\}$$ and
$$D_2=\{(0,x,0,z):x\ge 0, z\in \mathbb{C}, x^2+z\overline{z}=1\}.$$ Then both are hemispheres in $\mathbb{R}^3$ and hence homeomorphic to $2$-disks, so their Cartesian product $D_1\times D_2$ is homeomorphic to a $4$-disk. So it suffices to show that $D_1\times D_2$ is homeomorphic to $D$.

In particular, given a pair $(v,u)\in D_1\times D_2$, we can apply a certain rotation $T_v$ to $u$ (depending continuously on $v$) that transforms $(v,u)$ into an independent, orthonormal pair of vectors $(v,T_v(u))\in D$. We can define $T_v$ to be the unique rotation that takes $(0,0,1,0)$ to $v$ and fixes all vectors orthogonal to both $(0,0,1,0)$ and $v$. Then since $(0,0,1,0)$ is already orthogonal to $u$ by the definition of $D_2$, this will give a pair $(v,T_v(u))$ of orthonormal vectors, which are independent since $u$ and $v$ are. Tajakka proves (in much more generality) that this map gives the required homeomorphism.

Back to $\mathbb{R}$eality

The real Grassmannian also has a CW complex structure, given by an almost identical construction to the one above (see Hatcher, page 31). Let’s analyze the map described above, replacing all complex coordinates with real ones.

Let $v=(0,0,a,b)$ with $a^2+b^2=1$ and $a\ge 0$ as above. The map $T_v$, in coordinates, sends $(0,x,0,w)$ to $(0,x,-wb,wa)$. Keeping in mind that $a^2+b^2=1$ and $x^2+w^2=1$, it is easy to see that the vectors $(0,0,a,b)$ and $(0,x,-wb,wa)$ form an orthonormal $2$-frame. The disk $D$ in this setting is the set of all orthonormal $2$-frames of this form with $a,x\ge 0$.

Taking the image of $D$ under the quotient map $q:V_2(\mathbb{R}^4)\to \mathrm{Gr}_{\mathbb{R}}(4,2)$, it’s not hard to show that the restriction to the boundary is a $2$-to-$1$ map, as in the construction of the real projective plane. Indeed, if $q((0,0,a,b),(0,x,0,w))$ is the point of the Grassmannian represented by $$\left[\begin{array}{cccc} 0 & 0 & 0 & 1 \\ 0 & 1 & c & 0 \end{array}\right],$$ this forces $a=0$, $b=\pm 1$, $x=1$, $w=-c/b$. So there are two preimages of this point, determined by the two possible values for $b$.

I haven’t worked out (or found in a reference) what the degrees of the attaching maps are for the higher dimensional Schubert cells. If you work out a higher dimensional example, or know of a reference for this, please post it in the comments below!

Thanks to Jake Levinson for enlightening mathematical discussions that motivated me to write this post.

Higher specht polynomials

There are polynomials. There are Specht polynomials. And then there are higher Specht polynomials.

A colleague recently pointed out to me the results of this paper by Ariki, Terasoma, and Yamada, or more concisely summarized here. The authors give a basis of the ring of coinvariants $$R_n=\mathbb{C}[x_1,\ldots,x_n]/(e_1,\ldots,e_n)$$ which, unlike the other bases we’ve discussed on this blog so far, respects the decomposition into irreducible $S_n$-modules. These basis elements include the ordinary Specht polynomials, but also some polynomials of higher degrees, hence the name “higher” Specht polynomials.

Background: What is a Specht polynomial?

The construction of the irreducible representations of the symmetric group $S_n$ (called Specht modules) is often described in terms of “polytabloids”, but can be equivalently described in terms of a basis of Specht polynomials.

Before we describe this basis, recall that the dimension of the irreducible representation $V_\lambda$ of $S_n$ indexed by the partition $\lambda$ is equal to the number $f^\lambda$ of Standard Young Tableaux (SYT) of shape $\lambda$, i.e., the number of fillings of the Young diagram of $\lambda$ with the numbers $1,\ldots,n$ in such a way that the entries are increasing down rows and across columns. An example is shown below for $n=7$ and $\lambda=(3,3,1)$:

Given a standard Young tableau $T$, the Specht polynomial $F_T$ is defined as the following product taken over all of its columns $C$
$$F_T(x_1,\ldots,x_n)=\prod_{C} \prod_{i\lt j\in C}(x_j-x_i)$$

For instance, in the above tableau, $$F_T(x_1,\ldots,x_7)=(x_3-x_1)(x_4-x_1)(x_4-x_3)(x_6-x_2)(x_7-x_5).$$

In other words, the Specht polynomial for $T$ is the product of the Vandermonde determinants given by each of its columns’ entries. It is known that the set of polymomials $F_T$, where $T$ ranges over all SYT’s of shape $\lambda$, form a basis of their span, and their span is isomorphic to $V_\lambda$ as an $S_n$-module (here the $S_n$-action is the usual action on the variables $x_1,\ldots,x_n$).

Specht polynomials in the coinvariant ring

As described in detail in this post, the coinvariant ring is the $S_n$-module $$R_n=\mathbb{C}[x_1,\ldots,x_n]/(e_1,\ldots,e_n)$$ where $e_i$ is the $i$th elementary symmetric function in the variables $x_1,\ldots,x_n$.

The ring $R_n$ inherits the action of $S_n$ on the polynomial ring, and we saw in the previous post on the coinvariant ring that it is isomorphic to the regular representation as an $S_n$-module. It is also graded by polynomial degree, with Hilbert series equal to $(n)_q!$. There are many known bases of $n!$ elements that respect this grading, but the ones we have described in previous posts do not respect the $S_n$-action. In particular, can we find an explicit basis which partitions into bases of the irreducible components in $R_n$?

To start, the Specht polynomials $F_T$ defined above are nonzero in $R_n$ and span one copy of each Specht module. But we know that in the regular representation, there are $f^\lambda$ copies of the irreducible representation $V_\lambda$ for each $\lambda$, and this construction only gives one copy of each. In fact, since the copies of $V_\lambda$ must be alternating in the column variables, they are the lowest-degree copies of each. Hence, we need higher Specht polynomials.

To define them, we first need to understand the RSK correspondence and the charge statistic.

The tools: RSK and charge

Somehow, in all of my years of blogging, I have never actually described the RSK (Robinson-Schensted-Knuth) correspondence. By the nature of the regular representation of $S_n$, we must have $$\sum_{\lambda \vdash n} (f^{\lambda})^2=n!$$ since each irreducible representation $V_\lambda$ of dimension $f^\lambda$ occurs exactly $f^{\lambda}$ times, and the regular representation has dimension $n!$. Recall also that $f^\lambda$ is the number of standard Young tableaux of shape $\lambda$.

The RSK correspondence gives a combinatorially described bijection between pairs of standard Young tableaux of the same partition shape $\lambda$ of $n$ (for some $\lambda$) and permutations of $n$. It’s a beautiful bijection, but since the readers of this post will most likely already be familiar with it, I will simply refer the reader to Simon Rubenstein-Salzedo’s clearly-written introduction to RSK.

The upshot is that there is one higher Specht polynomial for each pair of standard Young tableaux $(S,T)$ of the same shape, denoted $F_T^S$. The degree of $F_T^S$ is given by the charge statistic on $S$, defined as follows. Let $w(S)$ be the reading word of $S$, formed by concatenating the rows of $S$ from bottom to top, and assign subscripts $s(i)$ to the letters $i$ of the reading word recursively as follows. Assign the letter $1$ the subscript $s(1)=0$, and for each $i\ge 1$, if $i+1$ is to the right of $i$ then assign it $s(i+1)=s(i)$, and otherwise $s(i+1)=s(i)+1$. The charge of $S$ is the sum of the subscripts $s(i)$.

For example, let $S$ and $T$ be the tableaux given below:

Then the charge of $S$ is computed by first forming the reading word $$w(S)=3267145$$ and then placing subscripts starting from the letter $1$ according to the rules above:
$$3_2 2_1 6_3 7_3 1_0 4_2 5_2$$
Then the charge of $S$ is $2+1+3+3+0+2+2=13$.

The lowest-degree (minimal charge) case occurs when $S=S_0$ is the canonical choice having elements $1,2,3,\ldots,n$ written in order left to right in each row from top to bottom, as in:

In this case we will recover the standard Specht polynomials, and the others are the higher Specht polynomials.

Construction of the Higher Specht polynomials

We now have the tools to describe the construction. Given a pair $(S,T)$ of standard Young tableaux of the same shape $\lambda$, let $x_T^{\mathrm{charge}(S)}$ be the monomial $x_1^{i_1}\cdots x_n^{i_n}$ where $i_k$ is the charge subscript on the cell in $S$ corresponding to the cell containing $k$ in $T$. For example, if we fill the boxes of $S$ in the example above with their corresponding subscripts in the charge computation, it looks like:

and we have $x_T^{\mathrm{charge}(S)}=x_2^2x_3x_4^2x_5^2x_6^3x_7^3$.

Then the higher Specht polynomial $F_T^S$ is defined as $$F_T^S(x_1,\ldots,x_n)=\sum_{\tau\in \mathrm{Col}(T)}\sum_{\sigma\in \mathrm{Row}(T)} \mathrm{sgn}(\tau)\tau\sigma x_T^{\mathrm{charge}(S)}$$ where $\mathrm{Row}(T)$ and $\mathrm{Col}(T)$ are the subgroups of $S_n$ generated by the permutations of the rows of $T$ and the columns of $T$, respectively.

Notice that this polynomial has degree given by the charge of $S$, which is minimal when $S=S_0$ is the canonical tableau of shape $\lambda$ as described above. And when $S=S_0$, $x_T^{\mathrm{charge}(S)}$ is the product $\prod_{i} x_i^{h_i}$ where $h_i$ is the row of the entry $i$ in $T$. It’s not hard to see that antisymmetrizing this monomial over the column permutations recovers the ordinary Specht polynomial $F_T$. Furthermore, symmetrizing across rows keeps the monomial fixed, and so the higher Specht polynomial $F_T^{S_0}$ is simply a constant multiple of the ordinary Specht polynomial $F_T$.

In general the Specht module $V_\lambda$ occurs once for each tableau $S$ of shape $\lambda$, giving the right multiplicities for each irreducible representation.

Thanks to some Sage code by Nicolas Thiery, we can use this formula to efficiently compute the higher Specht polynomials. Here are the polynomials for $n=3$:

\[
6 \\
2(y-x) \\
2(z-x) \\
z(y-x) \\
y(z-x) \\
(y-x)(z-x)(z-y)
\]

Note that the first three and the last are (scalar multiples of) ordinary Specht polynomials, and $z(y-x)$ and $y(z-x)$ are two higher Specht polynomials giving another copy of $V_{(2,1)}$.

Generalizations

In their paper, Ariki, Terasoma, and Yamada actually construct a more general version of the higher Specht polynomials that respects the restricted $S_\mu$-module structure of $R_n$ where $S_{\mu_1}\times \cdots \times S_{\mu_k}\subseteq S_n$ is any Young subgroup. It’s also natural to ask whether there are generalizations of the higher Specht modules to related rings such as the Garsia-Procesi modules or even (we can dream!) the Garsia-Haiman modules.