
Minifying NESABC

If you'd prefer a Colab notebook, check this out.

As part of my journey to use GPT-2 to generate chiptune music, I needed to reduce the size of the ABC files generated by the midi2abc utility.

Here's where we started:

Bytes per ABC song

The average song was 3083 bytes long.

I started by following Gwern's tactic of removing spaces. That reduced the average length to 2929 bytes.

I then looked through my ABC files for further opportunities to reduce their size.
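A minimal sketch of the space-removal step, assuming that only music lines are touched and that header fields and comments keep their spaces. That split is my assumption, and `is_music` and `strip_spaces` are hypothetical helper names rather than anything from Gwern's or my original code.

```python
import re

# Field lines look like "K:C" or "V:1"; comments and directives start with "%".
FIELD = re.compile(r"^[A-Za-z]:")


def is_music(line: str) -> bool:
    """Treat any non-empty line that is not a field or a comment as music."""
    return bool(line.strip()) and not line.startswith("%") and not FIELD.match(line)


def strip_spaces(abc_text: str) -> str:
    """Remove spaces from music lines only, keeping line breaks intact."""
    return "\n".join(
        line.replace(" ", "") if is_music(line) else line
        for line in abc_text.split("\n")
    )


if __name__ == "__main__":
    print(strip_spaces("K:C % 0 sharps\nz8|z6 zA-|A2- A/2f3-f/2"))
```

Spaces in ABC mostly control how notes are beamed together on a printed score, which doesn't matter for this use.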

Removing empty voices

The first thing I noticed was that songs often have voices that are completely silent. In fact, 607 of my 3463 songs have a silent voice.

In the general case, you should remove a voice without notes. But because I'm dealing with songs generated from NES-MDB, they always have 4 voices, and each voice corresponds to particular hardware on the NES. I wanted that to be clear to GPT-2, so instead of removing and renumbering the voices I kept each silent voice's header and removed its contents completely.

For example:

X: 1
T: from ../nesabc/midis/398_Ys_AncientYsVanishedOmen_08_09PalaceB3F.mid
M: 4/4
L: 1/8
Q:1/4=120
% Last note suggests Phrygian mode tune
K:C % 0 sharps
V:1
%%MIDI program 80
z8|z6 zA-|A2- A/2f3-f/2 z/2^c3/2-|^c2 e3-e/2z/2 g2-|...
V:2
%%MIDI program 81
z8|z6 zA-|A2- A/2f3-f/2 z/2^c3/2-|^c2 e3-e/2z/2 g2-|...
V:3
%%MIDI program 38
D,/2z/2F,/2z/2 A,/2z/2E,/2z/2 D,/2F,/2z/2A,/2 z/2E,...
V:4
%%MIDI channel 10
zz zz/2z/2 z/2z/2z/2z/2 z/2zz/2|z/2zz/2 z/2z/2z/2z/...

Becomes

X: 1
T: from ../nesabc/midis/398_Ys_AncientYsVanishedOmen_08_09PalaceB3F.mid
M: 4/4
L: 1/8
Q:1/4=120
% Last note suggests Phrygian mode tune
K:C % 0 sharps
V:1
%%MIDI program 80
z8|z6 zA-|A2- A/2f3-f/2 z/2^c3/2-|^c2 e3-e/2z/2 g2-|...
V:2
%%MIDI program 81
z8|z6 zA-|A2- A/2f3-f/2 z/2^c3/2-|^c2 e3-e/2z/2 g2-|...
V:3
%%MIDI program 38
D,/2z/2F,/2z/2 A,/2z/2E,/2z/2 D,/2F,/2z/2A,/2 z/2E...
V:4
%%MIDI channel 10

After this transformation, the average song dropped from 2929 to 2834 bytes.
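Here is a rough sketch of how that transformation could be written. It assumes a voice counts as silent when its music lines contain nothing but rests (`z`), durations, bar lines, and whitespace, and that each `V:` line starts a voice that runs until the next `V:` line. The function and helper names are hypothetical.

```python
import re

FIELD = re.compile(r"^[A-Za-z]:")          # field lines such as "V:4" or "K:C"
REST_ONLY = re.compile(r"^[z\d/|\s]*$")    # rests, durations, bar lines, whitespace


def is_music(line: str) -> bool:
    """Treat any non-empty line that is not a field or a comment as music."""
    return bool(line.strip()) and not line.startswith("%") and not FIELD.match(line)


def blank_silent_voices(abc_text: str) -> str:
    """Drop the music lines of rest-only voices, keeping their V: headers."""
    lines = abc_text.split("\n")

    # First pass: collect the music lines that belong to each voice.
    voice, music_by_voice = None, {}
    for line in lines:
        if line.startswith("V:"):
            voice = line[2:].strip()
            music_by_voice.setdefault(voice, [])
        elif voice is not None and is_music(line):
            music_by_voice[voice].append(line)

    silent = {
        name
        for name, body in music_by_voice.items()
        if all(REST_ONLY.match(music_line) for music_line in body)
    }

    # Second pass: copy everything except music lines of silent voices.
    out, voice = [], None
    for line in lines:
        if line.startswith("V:"):
            voice = line[2:].strip()
        if voice in silent and is_music(line):
            continue
        out.append(line)
    return "\n".join(out)
```

The `V:` line and any `%%MIDI` directive under it are left alone, so the voice numbering stays the same across every song.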

Removing bars

A line of notes looks like this:

z8|z6zA-|A2-A/2f3-f/2z/2^c3/2-|^c2 e3-e/2z/2g2|

Those pipes are bar lines. They make the music more readable, but we don't care about that, so I removed them.

That change brought us from 2834 to 2767 bytes.
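A small sketch of bar removal, assuming the files only use plain `|` bar lines. Repeat signs such as `|:` would need extra handling that this does not attempt. The helper names are hypothetical.

```python
import re

FIELD = re.compile(r"^[A-Za-z]:")  # field lines such as "V:1" or "K:C"


def is_music(line: str) -> bool:
    """Treat any non-empty line that is not a field or a comment as music."""
    return bool(line.strip()) and not line.startswith("%") and not FIELD.match(line)


def strip_bar_lines(abc_text: str) -> str:
    """Delete every '|' from music lines, leaving other lines untouched."""
    return "\n".join(
        line.replace("|", "") if is_music(line) else line
        for line in abc_text.split("\n")
    )


if __name__ == "__main__":
    print(strip_bar_lines("V:1\nz8|z6zA-|A2-A/2f3-f/2z/2^c3/2-|"))
```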

Removing comments, error messages, title, and ID

These are the last things that aren't strictly required.

That change brought us down from 2767 to 2660 bytes.
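One way this step might look, as a sketch under a few assumptions: the ID is the `X:` field, the title is the `T:` field, comments start with a single `%`, and `%%` directives such as `%%MIDI` are kept. Error messages are only caught here if they appear as `%` comments. The function name is hypothetical.

```python
def strip_metadata(abc_text: str) -> str:
    """Drop X:/T: fields and '%' comments; keep '%%' directives."""
    out = []
    for line in abc_text.split("\n"):
        if line.startswith(("X:", "T:")):
            continue                      # ID and title fields
        if line.startswith("%%"):
            out.append(line)              # directives such as %%MIDI (assumed kept)
            continue
        line = line.split("%", 1)[0].rstrip()   # cut whole-line and trailing comments
        if line:
            out.append(line)              # skip lines that are now empty
    return "\n".join(out)


if __name__ == "__main__":
    print(strip_metadata("X: 1\nK:C % 0 sharps\nV:1\n%%MIDI program 80\nz8z6zA-"))
```

Trailing comments such as the `% 0 sharps` after `K:C` are cut along with whole-line comments.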

Unfortunately, that's where I ran out of ideas.
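To put the steps side by side, this short script tallies the averages reported above. The numbers come straight from this post; the stage labels and the `savings` helper are just illustrative names.

```python
# Average bytes per song after each step, as reported in this post.
STAGES = [
    ("original", 3083),
    ("spaces removed", 2929),
    ("silent voices emptied", 2834),
    ("bars removed", 2767),
    ("comments, title, ID removed", 2660),
]


def savings(stages):
    """Return (label, bytes saved by this step, cumulative % saved) per step."""
    start = stages[0][1]
    rows = []
    for (_, before), (label, after) in zip(stages, stages[1:]):
        rows.append((label, before - after, 100 * (start - after) / start))
    return rows


if __name__ == "__main__":
    for label, saved, cumulative in savings(STAGES):
        print(f"{label}: -{saved} bytes ({cumulative:.1f}% total)")
```

Overall that is 423 bytes per song, or roughly 13.7% of the original average.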

Bytes per ABC song before minification

Bytes per ABC song after minification