The deceitful panacea of alt text

One of my favorite[1] accessibility myths is this pervasive idea that alternate text is some kind of accessibility panacea. I get it – it’s theoretically[2] a thing that content creators of any skill level can do to make their content more accessible. Because of this (and because the alt attribute is technically required on <img> tags in HTML), it seems to be one of the first things people learn about accessibility. For the uninitiated, alternate text (from here on out, alt text) is metadata attached to an image that assistive tech (such as a screen reader) will use to present a description of that image (since we don’t all have neural network coprocessors to do deep machine-learning and describe images for us).

This is all very good, if we have a raster-based image with no other information to work with. The problem is, we should almost never have that image to begin with. Very few accessibility problems are actually solved with alt text. For starters, raster images have a fixed resolution. When users with limited vision (but not so limited as to warrant use of a screen reader) attempt to zoom in on these, as they are wont to do, that ability is limited. Best case scenario, the image is at print resolution, 300dpi. This affords maybe a 300% zoom, and even then there may be artifacting. Another common pitfall is that images (particularly of charts and the like) are often used as a crutch when an author can’t figure out a clean way to present their information. Often this means color is used as the only means of communicating information (explicitly prohibited by §508), or it means that the information is such a jumble that users with learning disabilities are going to have incredible difficulty navigating it.
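The zoom ceiling mentioned above can be sanity-checked with some rough arithmetic. This is only a sketch under stated assumptions: the image is placed at its physical print size, the display shows roughly 96 pixels per inch (the `display_ppi` default here is an assumption, not a figure from this post), and "zoom" means magnifying until one image pixel covers one screen pixel. The function name is made up for illustration.

```python
def max_sharp_zoom(image_dpi: float, display_ppi: float = 96.0) -> float:
    """Largest zoom factor before the image has to be upsampled.

    Past this factor each image pixel is stretched over more than
    one screen pixel, which is where blur and artifacts begin.
    """
    return image_dpi / display_ppi


if __name__ == "__main__":
    for dpi in (72, 300):
        print(f"{dpi}dpi image: about {max_sharp_zoom(dpi):.0%} zoom")
```

Under those assumptions a 300dpi image tops out a little above 300%, while a 72dpi image is already being stretched at 100%.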

Information often wants to fall into a particular structure. When I’m given a bar chart at work, in its original, non-rasterized form, I just structure it back into a table behind the scenes (in PDF). If you’re trying to communicate a message (particularly data), often part of the problem is that there’s a lot of information to communicate. This requires further structuring, and alt text is ‘flat’. By this I mean that it lacks the capability to be structured – it’s generally restricted to paragraph breaks, if that.

An anecdote: in my professional life, I asked a customer to provide original or recreated (but non-rasterized) versions of the infographics in a document, and ended up with one whose description, when pasted into Word, ran to two pages’ worth of text. I explained to the customer that this was far too much content for alt text, for the reasons already mentioned. She responded that her ex-husband was blind, and that how she had written it was exactly how he would have wanted to hear her read it. She failed to understand that if he knew what he was hearing was irrelevant to what he wanted to hear, he could ask her to skip ahead to the relevant bits. She failed to understand that if he missed how this piece of information related to the bigger picture (think, header row and column in a table), he could ask her. She failed to understand that she was not a robot and he probably enjoyed listening to her talk more than NVDA.

And it is here that we come to a major pitfall of accessibility work in general. Folks think that it’s enough to provide information, without any consideration for how that information is structured (or not). Pages’ worth of descriptions of a table are not a suitable replacement for an actual table, where data always has context available if necessary and always has a sense of where it exists in two dimensions. Navigating among sections is a godsend when you’re trying to get through massive amounts of complex data, and fluffy tangential details are simply a waste of time when you’re listening to a robot. This is all on top of the issues that exist for folks struggling with poorly rendered or poorly designed images despite not using a screen reader.

Alt text is not a panacea. If it is to be used, it should be concise and clear, while presenting all of the relevant information a sighted user would grasp. If this is not possible, the image should not be rendered as a raster image, period. Listen to your alt text in a screen reader. Try to find a specific data point. If you get lost, find another way to present the information. If you stick with your rasterized image, drop in the highest-resolution version that you possibly can. Print resolution is a minimum; 72dpi is for abled folks. Don’t use color as an exclusive means to communicate or associate data with meaning. Learn to resist images, and when you use them, learn to embed them in inherently machine-friendly ways.

Binaries and hex editors

Talking about certain files as ‘binaries’ is a funny thing. All files are ultimately binary, after all; it’s just a matter of whether or not a file is encoded as text. Even in the world of text, an editor or viewer needs to know how the text is encoded, which bytes map to which characters. Is a file ASCII, UTF-8, PostScript? Once we know whether something is text or not, it’s still likely to be made to the standards of a specific format, unless it is nothing but plain text. Markdown, HTML, even PDF[1] are human-readable text to an extent, with rules about how their content is interpreted. A human as well as a web browser knows that a <p> starts a paragraph, and this paragraph continues until a matching </p> is found.
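To make the point about encodings concrete, here is a small sketch showing one run of bytes read three different ways. The sample word and the choice of Latin-1 as the "wrong" encoding are mine, picked only for illustration.

```python
# Five bytes: the word "café" as UTF-8 would store it.
raw = bytes([0x63, 0x61, 0x66, 0xC3, 0xA9])

# Read with the encoding it was written in, the bytes make sense.
as_utf8 = raw.decode("utf-8")

# Read as Latin-1, every byte is its own character, so the
# two-byte "é" turns into two unrelated characters.
as_latin1 = raw.decode("latin-1")

# Read as strict ASCII, bytes above 0x7F are not valid at all.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

if __name__ == "__main__":
    print(repr(as_utf8), repr(as_latin1), ascii_ok)
```

The bytes never change; only the table used to map them to characters does.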

If we open a binary in a text editor, we’ll see a lot of familiar characters, where data happens to coincide with printable ASCII. We’ll also see a lot of gibberish, and in fact some of the characters may cause a terminal to behave erratically. Opening a binary in a hex editor makes a little more sense of it, but still leaves a lot to be answered. In one column, we’ll see a lot of hexadecimal values; in another we’ll see the same sort of gibberish we would have seen in our text editor. In some sort of status display, we’ll also generally see a few more bits of information – what byte we’re on, its hex value, its decimal value, etc. Why would we ever want to do this? Well, among other things, binary file formats have rules as well, and if we know these rules, we can inspect and navigate them much like an HTML file. Take this piece of a PNG file, as it would appear in bvi (my hex editor of choice).

00000000  89 50 4E 47 0D 0A 1A 0A 00 00 00 0D 49 48 44 52 .PNG........IHDR
00000010  00 00 02 44 00 00 01 04 08 06 00 00 00 C9 50 2B ...D..........P+
00000020  AB 00 00 00 04 73 42 49 54 08 08 08 08 7C 08 64 .....sBIT....|.d
00000030  88 00 00 00 09 70 48 59 73 00 00 0B 12 00 00 0B .....pHYs.......
00000040  12 01 D2 DD 7E FC 00 00 00 1C 74 45 58 74 53 6F ....~.....tEXtSo
"ban_ln_560_NLW.png" 14498451 bytes    00000000 10001001 \211 0x89 137 NUL