The motive behind my last post on binary editors was a rather peculiar PNG I was asked to post as part of my job. It was a banner, 580x260px, and it was 14MB. Now this should have set off alarms from higher up the web chain: even with the unnecessary alpha channel, 580(px)×260(px)×(8(bits)×4(R,G,B,A)) is only 600KB or so of raw, uncompressed pixels. A very basic knowledge of how information is stored is always helpful. File sizes that stray from this kind of arithmetic are largely because of compression or encryption, and neither explains a PNG more than twenty times bigger than its raw pixel data.
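For the record, here is that back-of-the-envelope math as a few lines of Python. The only assumption is that "14MB" means 14 million bytes; the exact figure doesn't change the conclusion.

```python
# Raw (uncompressed) size of a 580x260 image with 8 bits per channel, RGBA.
WIDTH, HEIGHT = 580, 260
BITS_PER_CHANNEL = 8
CHANNELS = 4  # R, G, B, A

raw_bytes = WIDTH * HEIGHT * CHANNELS * BITS_PER_CHANNEL // 8

# Assumption: "14MB" taken as 14 million bytes.
actual_bytes = 14 * 1000 * 1000

print(raw_bytes)                           # 603200
print(round(actual_bytes / raw_bytes, 1))  # 23.2
```

So the file is roughly 23 times larger than the worst case for its pixels, before PNG's compression has even had a chance to shrink anything.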
So what happened? Adobe Fireworks, which is completely unsurprising. Fireworks was a Macromedia project, and while Macromedia obviously shaped a large chunk of the web in their heyday and also into the Adobe years, Macromedia projects were shit. The very definition of hack. I’m certain Adobe learned all of their terrible nonstandard UI habits from their Macromedia acquisition. I never thought Fireworks was terrible, but nor did I find it impressive. It was often used for wireframing websites, which feels wrong to me in every single way. But, to get ahead of myself, it had one other miserable trick: saving layers and other extended data in PNG files. Theoretically, this is great: layer support in an easily-read compressed lossless free image format. Awesome! But in Adobe’s reality, it’s terrible: not even any current Adobe software can recover these layers.
As mentioned in my previous post, PNGs are pretty easy to parse. Data comes in chunks: the first 4 bytes state the length of the chunk data, then come 4 bytes of (ASCII) chunk type descriptor, then the chunk data itself, then a 4-byte CRC checksum. Some chunks are necessary: IHDR is the header that states the file’s pixel dimensions, bit depth, color type, interlacing, etc.; IDATs contain the actual image data. Other chunks are described by the format but not necessary. Finally, there are unreserved chunks that anyone can use, and that this or that reader can theoretically read. The chunk type is 4 ASCII bytes, and is both a (potentially) clever descriptor of the chunk and 4 bits worth of information, because each character’s case means something.
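To make that layout concrete, here is a minimal sketch of a chunk walker in Python, using only the standard library. The names `iter_chunks` and `make_chunk` are my own inventions for illustration, and the demo file is a 1x1 image built in memory rather than the actual banner.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the 8-byte PNG magic number


def iter_chunks(data):
    """Yield (type, body, crc_ok) for each chunk in a PNG byte string."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos + 12 <= len(data):
        # 4 bytes of big-endian length, then 4 bytes of ASCII chunk type.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        end = pos + 12 + length
        if end > len(data):
            raise ValueError("truncated chunk")
        body = data[pos + 8:pos + 8 + length]
        # The CRC covers the type and the data, but not the length field.
        (stored_crc,) = struct.unpack(">I", data[end - 4:end])
        yield ctype.decode("ascii"), body, zlib.crc32(ctype + body) == stored_crc
        pos = end
        if ctype == b"IEND":
            break


def make_chunk(ctype, body):
    """Hypothetical helper: wrap a body in length, type, and CRC."""
    crc = zlib.crc32(ctype + body)
    return struct.pack(">I", len(body)) + ctype + body + struct.pack(">I", crc)


# Demo: a 1x1 RGBA image (one filter byte plus one red pixel).
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 6, 0, 0, 0)
idat = zlib.compress(b"\x00" + b"\xff\x00\x00\xff")
demo_png = (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"IDAT", idat) + make_chunk(b"IEND", b""))

for ctype, body, crc_ok in iter_chunks(demo_png):
    print(ctype, len(body), "crc ok" if crc_ok else "crc BAD")
```

Pointing something like this at a suspicious file and tallying `len(body)` per chunk type is a quick way to see where the bytes are actually going.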
So my image should have had a few things: the PNG magic number, 25 bytes worth of IHDR chunk explaining itself, at most ~600KB worth of IDAT chunk, and then an IEND chunk to seal the deal. Those were definitely present in my terrible file. Additionally, there were a handful of proprietary chunks, including several hundred mkBT chunks. I don’t know much about these chunks aside from the fact that they start with a FACECAFE magic number and then 72 bytes of… something… And I also know there are a lot of them. Some cursory googling shows that nobody else really knows what to make of them either, so I’m not sure I’m going to put more effort into it. Suffice it to say: Fireworks, by default, saves layers in PNG files, and this turned a file that should have been ~600KB at the very most into 14MB.
So why do the files even work? Well, remember I mentioned that case in a chunk descriptor is important: it provides 4 bits of information. Note the difference between the utterly important IDAT and the utterly bullshit mkBT. From left to right, lower vs. uppercase means: ancillary/critical; private/public; (reserved for future use, these should all be uppercase for now); safe/unsafe to copy. The important thing to glean here is that mkBT is ancillary, not critical. We do not need it to render an image.
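Those four case bits are easy to pull out in code. A small sketch follows; `chunk_properties` is a name I made up, and it relies on the fact that in ASCII a lowercase letter is the uppercase one with bit 5 (0x20) set.

```python
def chunk_properties(chunk_type):
    """Decode the four property bits hidden in a PNG chunk type's letter case."""
    if len(chunk_type) != 4 or not (chunk_type.isascii() and chunk_type.isalpha()):
        raise ValueError("chunk type must be 4 ASCII letters")
    # Bit 5 of each byte is set for lowercase letters, clear for uppercase.
    lower = [bool(byte & 0x20) for byte in chunk_type.encode("ascii")]
    return {
        "ancillary": lower[0],        # lowercase: not needed to render the image
        "private": lower[1],          # lowercase: not a publicly registered type
        "reserved_ok": not lower[2],  # must be uppercase for now
        "safe_to_copy": lower[3],     # lowercase: survives edits by unaware tools
    }


for name in ("IDAT", "mkBT", "tEXt"):
    print(name, chunk_properties(name))
```

IDAT comes out critical and public, while mkBT comes out ancillary, private, and unsafe to copy, which is exactly why a decoder is free to skip it.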
So, when we load our 14MB PNG in a web browser, the browser sees the IDATs and renders an image. It ignores all the garbage it can’t understand. This is perfectly valid PNG: because all of those extra chunks are ancillary, the browser can ignore them. PNG requires a valid IDAT, so Fireworks must put the flat image there. So, it works, but we’re still stuck with a humongous PNG. Most image editors will discard all of this stuff because it’s self-described as unsafe-to-copy (meaning any editing of the file may render this chunk useless). But for reference, pngcrush will eliminate these ancillary chunks by default, and optipng will with the -strip all flag.
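If neither tool is handy, the same idea can be sketched in a few lines of Python. To be clear, this is not how pngcrush or optipng work internally; it is a cruder stand-in that drops only chunks that are both ancillary and private (like mkBT) and copies everything else byte for byte, without recompressing anything. The names `strip_private_chunks` and `make_chunk` are hypothetical, and the bloated file is a tiny fake built in memory.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def strip_private_chunks(data):
    """Return a copy of a PNG without its ancillary, private chunks."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    kept = [PNG_SIGNATURE]
    pos = 8
    while pos + 12 <= len(data):
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        ctype = data[pos + 4:pos + 8]
        end = pos + 12 + length  # length + type + data + CRC
        if end > len(data):
            raise ValueError("truncated chunk")
        ancillary = bool(ctype[0] & 0x20)  # first letter lowercase
        private = bool(ctype[1] & 0x20)    # second letter lowercase
        if not (ancillary and private):
            kept.append(data[pos:end])     # copy the chunk untouched
        pos = end
        if ctype == b"IEND":
            break
    return b"".join(kept)


def make_chunk(ctype, body):
    """Hypothetical helper: wrap a body in length, type, and CRC."""
    crc = zlib.crc32(ctype + body)
    return struct.pack(">I", len(body)) + ctype + body + struct.pack(">I", crc)


# Demo: a 1x1 image padded with made-up mkBT junk.
ihdr = make_chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 6, 0, 0, 0))
idat = make_chunk(b"IDAT", zlib.compress(b"\x00\xff\x00\x00\xff"))
iend = make_chunk(b"IEND", b"")
junk = make_chunk(b"mkBT", bytes.fromhex("facecafe") + bytes(72)) * 300

bloated = PNG_SIGNATURE + ihdr + junk + idat + iend
clean = strip_private_chunks(bloated)
print(len(bloated), "->", len(clean))
```

Public ancillary chunks such as tRNS or gAMA can affect how the image looks, which is why this sketch leaves them alone; a real optimizer makes more careful choices than this.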
Takeaways? Know enough about raw data to see that your files are unreasonably large, I suppose. Or automatically assume that a 14MB file on your homepage is unreasonably large, regardless. Maybe that takeaway is just ‘perform a cursory glance at your file sizes’. Maybe it’s ‘flatten your images in Fireworks before exporting them to PNG’. Maybe instead of just performing lazy exports, web folks should be putting the time into optimizing the crap out of any assets that are being widely downloaded. Maybe I’m off track now, so my final thought is just this: if it looks wrong, save your audience some frustration and attempt to figure out why.