Minifying NESABC
If you'd prefer a Colab, check this out.
As part of my journey to use GPT-2 to generate chiptune music, I needed to reduce the size of the ABC files generated by the midi2abc utility.
Here's where we started:
With an average length of 3083 bytes.
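For reference, here is a minimal sketch of how an average like this could be measured. It assumes the songs are already loaded as Python strings; `average_bytes` is a hypothetical helper name, not something from my original code.

```python
def average_bytes(songs: list[str]) -> float:
    """Mean size of the songs in bytes, measured after UTF-8 encoding."""
    if not songs:
        raise ValueError("need at least one song")
    return sum(len(song.encode("utf-8")) for song in songs) / len(songs)
```

Re-running the same measurement after each transformation gives the before and after numbers quoted in this post.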
I started by following Gwern's tactic of removing spaces. That reduced the average length to 2929 bytes.
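A rough sketch of the space-removal step might look like the following. It assumes that only note lines should be touched, and that header fields (like `K:C`) and `%` comment lines can be recognized by their first characters. `remove_spaces` and `is_music_line` are names I made up for illustration, and the line test is a heuristic rather than a real ABC parser.

```python
def is_music_line(line: str) -> bool:
    """Heuristic: anything that is not blank, a comment, or a header field."""
    stripped = line.strip()
    if not stripped or stripped.startswith("%"):
        return False  # blank line, comment, or %% directive
    # Header fields are a single letter followed by a colon, e.g. "K:C".
    is_header = len(stripped) > 1 and stripped[0].isalpha() and stripped[1] == ":"
    return not is_header


def remove_spaces(abc: str) -> str:
    """Drop spaces from note lines and leave every other line untouched."""
    lines = []
    for line in abc.splitlines():
        lines.append(line.replace(" ", "") if is_music_line(line) else line)
    return "\n".join(lines)
```

A note line that happens to begin with a letter followed by a colon would be misread as a header here, so treat this as a starting point.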
I started looking through my ABC files to see if I could find any opportunities to reduce their size.
# Removing empty voices

The first thing I noticed was that songs often have voices that are completely silent. In fact, 607 of my 3463 songs have a silent voice.
In the general case, you should remove a voice that has no notes. Because I'm dealing with songs generated from NES-MDB, though, they always have 4 voices, and those voices correspond to particular hardware on the NES. I wanted that to stay clear to GPT-2, so instead of removing and renumbering the voices I just removed the silent voice's notes completely and kept its header.
For example:
X: 1
T: from ../nesabc/midis/398_Ys_AncientYsVanishedOmen_08_09PalaceB3F.mid
M: 4/4
L: 1/8
Q:1/4=120
% Last note suggests Phrygian mode tune
K:C % 0 sharps
V:1
%%MIDI program 80
z8|z6 zA-|A2- A/2f3-f/2 z/2^c3/2-|^c2 e3-e/2z/2 g2-|...
V:2
%%MIDI program 81
z8|z6 zA-|A2- A/2f3-f/2 z/2^c3/2-|^c2 e3-e/2z/2 g2-|...
V:3
%%MIDI program 38
D,/2z/2F,/2z/2 A,/2z/2E,/2z/2 D,/2F,/2z/2A,/2 z/2E,...
V:4
%%MIDI channel 10
zz zz/2z/2 z/2z/2z/2z/2 z/2zz/2|z/2zz/2 z/2z/2z/2z/...
Becomes
X: 1
T: from ../nesabc/midis/398_Ys_AncientYsVanishedOmen_08_09PalaceB3F.mid
M: 4/4
L: 1/8
Q:1/4=120
% Last note suggests Phrygian mode tune
K:C % 0 sharps
V:1
%%MIDI program 80
z8|z6 zA-|A2- A/2f3-f/2 z/2^c3/2-|^c2 e3-e/2z/2 g2-|...
V:2
%%MIDI program 81
z8|z6 zA-|A2- A/2f3-f/2 z/2^c3/2-|^c2 e3-e/2z/2 g2-|...
V:3
%%MIDI program 38
D,/2z/2F,/2z/2 A,/2z/2E,/2z/2 D,/2F,/2z/2A,/2 z/2E...
V:4
%%MIDI channel 10
After this transformation, the average song dropped from 2929 to 2834 bytes.
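The idea can be sketched roughly as follows: split the song at each `V:` line, check whether a voice's note lines contain any pitch letters (`A`-`G` or `a`-`g`), and if not, drop those note lines while keeping the `V:` header and any `%%MIDI` directive. This is an illustration under those assumptions, not the exact code I ran; `blank_silent_voices` is a hypothetical name, and it assumes each voice appears as one contiguous block the way midi2abc writes it.

```python
import re

PITCH = re.compile(r"[A-Ga-g]")      # any pitched note; rests are "z"
HEADER = re.compile(r"[A-Za-z]:")    # header field such as "V:1" or "K:C"


def is_music_line(line: str) -> bool:
    stripped = line.strip()
    if not stripped or stripped.startswith("%"):
        return False
    return HEADER.match(stripped) is None


def has_notes(line: str) -> bool:
    # Ignore anything after "%" so comment text does not count as notes.
    return PITCH.search(line.split("%", 1)[0]) is not None


def blank_silent_voices(abc: str) -> str:
    lines = abc.splitlines()
    starts = [i for i, line in enumerate(lines) if line.startswith("V:")]
    if not starts:
        return abc
    out = lines[: starts[0]]  # tune header before the first voice
    for start, end in zip(starts, starts[1:] + [len(lines)]):
        section = lines[start:end]
        music = [line for line in section if is_music_line(line)]
        if not any(has_notes(line) for line in music):
            # Silent voice: keep "V:n" and directives, drop the rests.
            section = [line for line in section if not is_music_line(line)]
        out.extend(section)
    return "\n".join(out)
```

Keeping the empty `V:` header means voice 4 is still voice 4 in every song, which is the property I wanted GPT-2 to see.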
# Removing bars

A line of notes looks like this:
z8|z6zA-|A2-A/2f3-f/2z/2^c3/2-|^c2 e3-e/2z/2g2|
Those pipes correspond to music bars. They make the music more readable, but we don't care about that, so I removed them.
That change brought us from 2834 to 2767 bytes.
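As a sketch, removing the bars can be as simple as deleting every `|` from the note lines. The version below skips header fields and `%` lines so that a pipe in, say, a title is not touched. `remove_bars` is a hypothetical name, and the header check is the same rough heuristic as before (one letter followed by a colon).

```python
def remove_bars(abc: str) -> str:
    """Delete bar lines ("|") from note lines only."""
    out = []
    for line in abc.splitlines():
        stripped = line.strip()
        is_header = len(stripped) > 1 and stripped[0].isalpha() and stripped[1] == ":"
        if stripped.startswith("%") or is_header:
            out.append(line)  # leave comments, directives, and fields alone
        else:
            out.append(line.replace("|", ""))
    return "\n".join(out)
```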
# Removing comments, error messages, title, and ID

These are the last things that aren't strictly required.
That change brought us down from 2767 to 2660 bytes.
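Here is one way this last step could look, as a sketch. It assumes the ID is the `X:` field, the title is the `T:` field, and that comments and error messages both show up as text after a single `%`. Lines starting with `%%` (like `%%MIDI program 80`) are kept, since those are directives rather than comments. `strip_metadata` is a hypothetical name; if error messages take a different form in your files, they will need their own rule.

```python
def strip_metadata(abc: str) -> str:
    """Drop X:/T: fields and "%" comments, keeping "%%" directives."""
    out = []
    for line in abc.splitlines():
        if line.startswith(("X:", "T:")):
            continue  # ID and title
        if line.startswith("%%"):
            out.append(line)  # directive, not a comment
            continue
        line = line.split("%", 1)[0].rstrip()  # cut trailing comment
        if line:
            out.append(line)
    return "\n".join(out)
```

Note that this also trims trailing comments such as the `% 0 sharps` after `K:C`, not just whole-line comments.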
Unfortunately, that's where I ran out of ideas.