The voice of a wizard hacking away

My pals at Sandy Pug Games have opened up preorders for WIZARDPUNK, a zine of various wizard stories and whatnot. It’s full of brilliant work, and I highly recommend checking it out! I have a little epistolary slice-of-life piece in it, which I’m honestly pretty proud of. In addition to this, I was asked if something rather curious was possible: whether some audio-producing computer code could be squished down to a reasonable size such that someone could theoretically type it in. The final result of this exercise is in the zine as well, and it was the kind of code-golf-esque challenge that was so interesting to me that I essentially knocked it all out mentally while soaking in the tub.

The interesting bits are the constraints. The core explicit constraint is that it needs to be able to be typed in, but this brings with it implicit constraints to bring it from ‘theoretically typable’ to ‘someone might actually try this’. Length is an obvious one, and I think to not detract from the rest of the zine, a single spread was kind of ideal anyway. Actually being able to print and type it is another; we can’t just release raw binary data. Finally, while I initially wondered if there was a plausible way to have folks generate a .COM file or the like, ideally the code is something cross-platform and also recognizable to folks who are not nerd-ass single-board computer hobbyists like me.

What occurred to me was that we could embed a base64 BLOB of an MP3 file into an HTML5 audio element. Base64 means we’re using human-typable binary. HTML means literally everyone with a computer in 2020 can run it. Additionally, HTML is ubiquitous enough that a lot of people – even wizards who have never actually dealt with web design – will recognize it and know what to do with it with little to no direction. The remaining problem, and one that left me more skeptical than confident from the computer-free sanctity of my bath, was payload length.
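For the curious, here is a rough Python sketch of how a wrapper like that could be generated. It is not the script used for the zine: the function name `build_audio_page`, the line width, and the exact markup are all my own illustrative assumptions. It also assumes the browser ignores line breaks inside a base64 data URI, which is worth checking in whatever browser you care about.

```python
import base64


def build_audio_page(mp3_bytes: bytes, width: int = 64) -> str:
    """Wrap raw MP3 bytes in a tiny HTML snippet using a base64 data URI.

    Hypothetical helper for illustration; not the zine's actual tooling.
    """
    encoded = base64.b64encode(mp3_bytes).decode("ascii")
    # Break the payload into short lines so it can be printed and typed in.
    lines = [encoded[i:i + width] for i in range(0, len(encoded), width)]
    # Assumption: line breaks inside the data URI are ignored by the browser.
    return (
        '<audio controls src="data:audio/mpeg;base64,\n'
        + "\n".join(lines)
        + '\n"></audio>\n'
    )


# Placeholder bytes, NOT a real MP3; swap in open("clip.mp3", "rb").read().
print(build_audio_page(b"\xff\xf3" + bytes(46), width=32))
```

Saving the printed output as an `.html` file and opening it should give you a little audio player, provided the bytes are a real MP3.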

Fortunately the audio clip I was going to be working with was essentially a vocal sample. I ended up converting it down to an 8kbps monaural MP3 file, which is… absurd. But, space-wise, everything was tight here. I initially tested with a 0.8 second sample of my voice, and was hopeful I might receive something comparable in length. The first sample I was sent was closer to two seconds, which sounds perfectly serviceable at first blush. But within our constraints, dealing with the scale we’re working with… doubling the size of the payload is significant to the point that it ultimately destroys the likelihood of the experiment remaining plausible[1]. Liam pared the clip down to one second, and we were back in business. It sounded crunchy and muddy, but 8kbps remained a sticking point. Going to 16kbps would double the payload which, again… is a big deal. It simply reduces the human-typable aspect far too much.
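The doubling is easy to see with some back-of-the-envelope arithmetic. This sketch is only a rough estimate: the function name is made up for illustration, and it assumes a constant bitrate while ignoring any header, padding, or encoder overhead, so real files will come out larger than these numbers.

```python
import math


def estimated_base64_chars(bitrate_kbps: float, seconds: float) -> int:
    """Rough typed-character count for an audio clip (hypothetical helper).

    Assumes constant bitrate and no file overhead, so treat the result
    as a floor rather than a prediction.
    """
    audio_bytes = math.ceil(bitrate_kbps * 1000 / 8 * seconds)
    # Base64 turns every 3 bytes into 4 characters, padding the last group.
    return 4 * math.ceil(audio_bytes / 3)


for kbps, secs in [(8, 1), (8, 2), (16, 1)]:
    print(f"{kbps} kbps, {secs} s: ~{estimated_base64_chars(kbps, secs)} characters")
```

Doubling either the duration or the bitrate doubles the byte count, and base64 passes that straight through to the number of characters someone has to type.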

Even with these constraints in place, the payload is 2,572 characters for a hopeful wizard to theoretically type in. To make matters worse, even though they’re technically human-readable, they’re still seemingly random strings like M7wXBlcFATAE2V5eM. Formatting-wise, I tried to break them up into relatively short lines, but it’s still… a lot. This means that there’s a lot of room for error here. My understanding of MP3 is pretty limited. I assumed it doesn’t contain a checksum since it’s a streamable format, but I wasn’t entirely positive; if I was wrong, a single erroneous character might make a browser reject it. I also didn’t know how good (or bad) the error correction might be. This whole endeavour piqued my curiosity enough that I’ll likely try to read the spec some day, but for now… I just started replacing and deleting characters and mashing reload in my browser. Luckily, and somewhat surprisingly, it was actually very fault-tolerant! A sloppy wizard can make a considerable number of errors with little to no audible effect.
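I did my mangling by hand, but the same experiment could be scripted. Here is a minimal sketch, assuming you only want to simulate substitution typos (deleted characters shift everything after them, which is a different kind of damage). The helper name `add_typos` and the seed parameter are my own inventions for illustration.

```python
import random
import string

B64_ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def add_typos(payload: str, count: int, seed: int = 0) -> str:
    """Swap `count` characters of a base64 string for other valid ones.

    Hypothetical helper: it models substitution typos only and leaves
    any trailing '=' padding alone.
    """
    rng = random.Random(seed)
    chars = list(payload)
    candidates = [i for i, c in enumerate(chars) if c != "="]
    for index in rng.sample(candidates, count):
        # Always pick a character that differs from the original one.
        choices = [c for c in B64_ALPHABET if c != chars[index]]
        chars[index] = rng.choice(choices)
    return "".join(chars)


print(add_typos("M7wXBlcFATAE2V5eM", 3, seed=1))
```

Paste the mangled string back into the audio element, reload, and listen for the damage.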

In the end, this was a fascinating experiment for me. None of it was particularly complicated, but figuring out these constraints and piecing together a viable solution? That’s the sort of wizard shit I can get behind. Anyway, go check out the wizard zine!

  1. The whole thing (payload and HTML wrapper) fills a two-page half-letter spread. It’s a lot, but it still feels manageable; you can see the entire thing in one go. Doubling this to four pages just feels far too daunting.