The VCSthetic

The Atari VCS, better known as the 2600, was an important part of my formative years with technology. It remains a system that I enjoy via emulation, and while recently playing through some games for a future set of posts, I started to think about what exactly made so many of the (particularly lesser-quality) games have such a unique aesthetic to them. The first third-party video game company, Activision, was famously started by ex-Atari employees who wanted credit and believed the system was better suited to original titles than hacked-together arcade ports. They were correct on this point, as pretty much any given Activision game looks better than any given Atari game for the VCS. Imagic, too, was made up of ex-Atari employees, and their games were pretty visually impressive as well. Atari had some better titles toward the end of their run, but for the most part their games and those of most third-parties are visually uninspiring. Yet the things that make them uninspiring are all rather unique to the system:

There’s a recurring theme to this aesthetic: lots of things happening in horizontal space. The VCS is made up of three primary chips: the MOS 6507 processor¹, the Television Interface Adaptor (TIA), and the 6532 RAM, I/O, and Timing chip. Of these, we’re primarily concerned with the TIA: it’s the closest thing we have to a video card, and its design is the primary force behind the aesthetic of the console. Before we dive into this, however, we need to know how televisions work. Or, at least, how televisions worked in 1977.

Cathode Ray Tubes (CRTs) shoot electrons at a phosphorescent screen, causing it to glow. In a television in 1977, these electrons are focused and moved around the screen via electromagnetism. Well, ‘moved around’ is a sloppy explanation at best, but the signal is created by shooting the focused beams from left to right (that is, horizontally), one line (scan line) at a time, top to bottom. After the bottom line is complete, the beam returns to the upper left and begins again. The analog TV signal contains vertical sync (start up top) information, followed by signals for each line, prefaced by horizontal sync (start to the left) information. This is a gross oversimplification, but it’s accurate enough for the important takeaways: lines are generated horizontally, one at a time, from top to bottom; and synchronization information is built in to the signal. For as antiquated as they seem now, it’s a wonder CRTs ever worked — they are engineering marvels.

The TIA helps a VCS developer manage the above, but it barely helps. The developer needs to keep track of timing and fit their code within the timing requirements of the video signal. Advanced games like some of those lovely Activision titles changed things during the horizontal timing to push the system beyond its intended capabilities — a technique famously known as ‘racing the beam’. But the unique VCS aesthetic of simpler games comes from using that TIA as intended, and the TIA doesn’t know anything beyond what it’s doing to the current horizontal line. This sounds a bit untenable, but it’s worth keeping in mind that the VCS had to find ways to help developers without adding costly RAM beyond the 128 bytes afforded programs.

In the few ways that the TIA does help, it can only help as far as ‘doing stuff horizontally’ is concerned, because it doesn’t know a thing about the previous or next line. It helps by giving the developer a few graphics registers to play with, including a background… or, more accurately, half of a background which is then either repeated or reflected onto the other side of the screen (horizontally symmetric backgrounds). Also included are two player sprites, with the helpful ability to stretch them out to two or four times their width or repeat them two or three times inside that stretched width (horizontally stretched sprites; many copies sharing a horizontal plane). A final convenience exists under the assumption that many games would be of the Pong-like or shooting varieties: single-dot objects (optionally horizontally stretched) that matched the player or background color and offered positioning and collision detection without maintaining a complex graphical object (simple ‘balls’ and ‘missiles’).
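To make the 'half of a background' idea concrete, here is a minimal sketch in Python (not 6507 assembly, and not how any real game was written). It builds one full scan line of background from its left half, either repeated or reflected, and widens one row of a 1-bit sprite. The function names, the 20-pixel half-line, and the sample patterns are all assumptions made up for illustration.

```python
def expand_playfield(half, reflect):
    """Build a full scan line of background from its left half.

    `half` is a string of '#' (background colour) and '.' (empty).
    With reflect=True the right side mirrors the left; otherwise it repeats it.
    """
    right = half[::-1] if reflect else half
    return half + right


def stretch_sprite(row, factor):
    """Widen one row of a 1-bit sprite by repeating each pixel `factor` times."""
    return "".join(pixel * factor for pixel in row)


# Hypothetical 20-pixel half-line, invented for this example.
half_line = "####......##........"

print(expand_playfield(half_line, reflect=True))   # mirrored background
print(expand_playfield(half_line, reflect=False))  # repeated background
print(stretch_sprite("..#..###", 2))               # double-width sprite row
```

Note that everything here works on a single line at a time, which is the point: nothing in the sketch knows about the line above or below.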

[Figure: first simulated Combat screenshot, annotated. The marked elements are all part of the low-res background, and a dashed line shows where the background is reflected. Also labeled: Player 1 and Player 1’s missile, and Player 2.]

[Figure: second simulated Combat screenshot, annotated. Player 2’s sprite is stretched horizontally, a function built into the TIA. Player 1’s sprites and missiles are cloned three times on the horizontal plane, also a function built into the TIA. The low-res background again is only half a screen’s worth, copied to the other side.]

Combat, two screenshots of which are simulated above, shows most of these elements, and is very much an archetypal example of the aesthetic I’m referring to. Of the six types of objects (background, two players, two missiles, ball), we see symmetrical, low-resolution backgrounds, player sprites both repeated and stretched, and missiles. Without stretching the limits of the system (racing the beam), this was the sort of thing that you got. With a lot of developers jumping on the home-console bandwagon without trying to live up to the likes of an Activision, this chunky, stretchy, limited-palette² aesthetic truly defines the system, and all for the sake of giving developers the bare minimum assistance in manipulating a signal for a display technology that operates on horizontal lines.

  1. It’s hard to overstate how important MOS Technology was to home computing. They used cutting edge MOSFET technology and pioneered the art of fixing the production masks, resulting in far higher viable chip yields than competitors. This, combined with a relatively simple chip design, made the 6500 series incredibly affordable for the time. ↩︎
  2. It’s worth noting that while many systems of yore had a unique aesthetic built around simply not supporting many colors (the Apple II and the typically 4-color CGA of the IBM PC come to mind), the VCS actually had a pretty rich palette! But there were only so many graphical objects, and all of them were 1-bit. Interestingly, this leads us to another pretty typical VCS aesthetic, albeit one that is not so much a deliberate function of the TIA. The fact that developers were constantly working with the knowledge of the system’s timing (down to where it was drawing on the screen) and the fact that all graphical objects had 1-bit color depth meant that quickly cycling the color of an object was an inexpensive effect that added a lot of visual impact. Since this, too, was bound by the line-by-line TIA behavior, it’s common to see a rainbow effect of colorful flashing horizontal strips in VCS games. Among others, Yars’ Revenge, Krull, and Swordquest: Earthworld are all first-party titles that make use of this effect. ↩︎

A few of my favorite: Tetrises (Tetrii? Tetrodes?)

I spent a couple of weeks writing this, and of course remembered More Thoughts basically as soon as I uploaded it. For starters, I had somehow completely forgotten about Minna no Soft Series: Tetris Advance for the GBA, which is a somewhat difficult to find Japanese release superior to Tetris Worlds in every imaginable way. Second, I neglected to mention leveling details and have updated the Puyo Puyo Tetris and mobile sections accordingly (as of 10-28).

Tetris, the ‘killer app’ of the Game Boy and proven-timeless time-sink has a pretty bizarre history. Alexey Pajitnov originally wrote it as a proof-of-concept for a Soviet computer that lacked graphics capability. Pajitnov’s coworkers ported the game to the IBM PC, and its availability on consumer hardware meant that unofficial ports popped up across the globe, and licensing deals were struck without Pajitnov’s involvement. Facing some difficult decisions regarding licensing, Pajitnov gave the Soviet Union the rights to the game. Licensing was then handled through a state-sponsored company known as Elorg (the famous Game Boy pack-in deal was during the Elorg era). During this period, brick colors and rules were inconsistent from this Tetris to that Tetris. Some games branded Tetris during this era bore next-to-no resemblance to the game we all know and love.

The Elorg deal was temporary by design, and some years later Pajitnov got the rights back and formed The Tetris Company. The Tetris Company has proven to be an absurdly aggressive intellectual property monster, which is hardly surprising given the game’s licensing history¹. The Tetris Company has done one positive thing, though: standardized the rules and the colors of blocks into something known as the Tetris Guideline. This means that any Tetris from the late ‘90s and newer is largely interchangeable² – and if you can make out the color of the next piece from the corner of your eye, you know what shape it is. The consistency is valuable, and even though years of NES Tetris have left me rather untalented at T-spins, all of my favorite Tetris games are of the modern sort. This also largely means that the distinction really boils down to hardware, but that’s kind of important when some form of the game has been released for pretty much any given system. So on that note, the four I most often reach for are:

Tetris (WonderSwan)
This one is solely about the hardware. The Bandai WonderSwan was a really clever handheld that never saw life outside of Japan. Designed by the original Game Boy’s creator, three iterations were made: one with a greyscale screen, and two with color screens. They’re not terribly expensive to acquire, but without a knowledge of Japanese, the playable library is quite limited. Tetris, of course, is an exception – which is why it routinely fetches ~$100 on eBay. One of the most unique features of the WonderSwan was an underutilized additional set of buttons that allowed games to be designed for either portrait or landscape play. Tetris plays portrait, and for that reason alone, the WonderSwan version is one of the most satisfying. It’s also the only WonderSwan color game that I’m aware of (it’s very possible there are others) that works on an original greyscale WonderSwan as well (like Game Boy DX titles). Being one of the earliest games to start adhering to a version of the Guideline, some of the gameplay seems a little off by modern standards – the game speeds up much quicker than a 2018 Tetris, and I’m fairly certain it doesn’t use the same shuffled/“bag” randomization algorithm as later games. Still, despite its quirks, its price tag, and its reliance on an obscure system, the WonderSwan’s Tetris remains among my favorites.
Puyo Puyo Tetris (Switch)
Puyo Puyo Tetris is a mash-up of Sega’s Dr. Mario-esque game Puyo Puyo and Tetris. I rag on this one a lot because I don’t really enjoy Puyo Puyo, and if you play the game’s charming story mode… you have to play a lot of Puyo Puyo. But, outside of the story mode, you can just play a regular game of Tetris, complete with the encouragement of the characters from story mode. It’s oddly satisfying to hear Ess inexplicably shout ‘Lipstick!’ when you clear a line. Puyo Puyo Tetris came out for the 3DS first, but only in Japan (and the 3DS is region-locked). The Switch release clearly wins for availability and localization, but I also think the Switch’s more tactile controls are better-suited for Tetris. Puyo Puyo Tetris uses ‘fixed leveling,’ where every ten lines cleared levels you up. This makes for quicker and more frantic play than versions with ‘variable leveling’. There is one huge misstep with Puyo Puyo Tetris, and that is the complicated path one must take to play an Endless game of Tetris. If you just choose the Tetris mode from the main menu, the game will stop after level 15. You must navigate several menus and turn on Endless every time, or else face serious disappointment. I’ve never understood why this tends to be the default mode: Tetris is not a game to be won; Tetris is a game of pushing yourself to your own limit.
Tetris Premium/Tetris (iOS)
There are at least three versions of Tetris on iOS (the situation on Android seems similar): Tetris Blitz, which is a two-minute version, and Tetris and Tetris Premium, which both contain a normal (Marathon) Tetris mode. Tetris is ad-supported with an in-app purchase to remove ads. Tetris Premium is paid, but costs less than removing ads from Tetris. Tetris also contains one additional gameplay mode, but ultimately you can download Tetris, and if you only find yourself playing the standard game, buy Tetris Premium. Bit of a mouthful, all that. These mobile versions use ‘variable leveling,’ where the line clears required to level up are 5 times the current level number. Compared to Puyo Puyo Tetris’s fixed leveling, this makes for a longer game and one that (to me) is better-paced. Playing on a smartphone has the same advantage as the WonderSwan: glorious portrait orientation. The obvious downside is the touch controls. There’s a weird mode that places phantom pieces in places the game suspects you might want them, but that somehow simultaneously feels like cheating and removes the rhythm enough that the game feels very different (and at times more difficult). The swipe controls, however, are… surprisingly manageable. Don’t get me wrong, at some point it will register wrong and throw off your entire stack, but for the most part it’s very playable. Swipe/hold left/right/down to move a piece, swipe down to hard drop, tap left/right half of screen to rotate, and swipe up to hold. I’m certainly not an expert player, but I have managed to make it to level 22 with the touch controls. They work pretty well.
And an honorable mention, Tetris Worlds (Game Boy Advance (GBA))
This contentious title kicked off the Tetris Guideline. Its ‘endless spins’ drew the most criticism, and it does make for a significantly different game than older versions without. Tetris Worlds was released for a few different systems, but I’m concerned with the GBA version. With their relatively small screens, none of the GBA-compatible systems (GBA, GBA SP, Game Boy Micro, DS, and DS Lite) are the most visually-spectacular³ systems for Tetris, but among those five systems is a lot of varied hardware with varied use-cases. Notably, I think the clicky buttons of the GBA SP make for a responsive, tactile experience, and Tetris Worlds on the Game Boy Micro is probably the best pocketable Tetris experience around (those keychain doodads are not great, and Pokémon Tetris on the Pokémon Mini is an expensive experience that pales in comparison to Tetris Worlds on the Micro). GBA’s Tetris Worlds does have two glaring issues: the animated backgrounds are distracting, and there is no Endless (Marathon tops out at level 15). The latter problem is a huge deal-breaker, but I still think it’s great loaded up in a Game Boy Micro and tossed into a bag.
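To put rough numbers on the two leveling schemes mentioned above (fixed leveling in Puyo Puyo Tetris, variable leveling in the mobile versions), here is a small Python sketch. It assumes the game starts at level 1 and that every cleared line counts as exactly one line toward the goal; real games may weight multi-line clears differently, so treat the totals as ballpark figures. The function names are made up for this example.

```python
def lines_to_reach_fixed(level, per_level=10):
    """Total line clears needed to reach `level` when every level costs the same."""
    return per_level * (level - 1)


def lines_to_reach_variable(level, multiplier=5):
    """Total line clears needed to reach `level` when each level costs multiplier * level."""
    return sum(multiplier * current for current in range(1, level))


for target in (5, 10, 15):
    fixed = lines_to_reach_fixed(target)
    variable = lines_to_reach_variable(target)
    print(f"level {target}: fixed {fixed} lines, variable {variable} lines")
```

Under those assumptions, reaching level 15 takes 140 lines with fixed leveling and 525 with variable leveling, which lines up with the variable scheme feeling like the longer game.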

  1. According to this article by one of the PC port-developers, Vadim Gerasimov, Pajitnov’s plan was always to make money off of the game, which “[…] seemed unusual and difficult because we lived in the Soviet Union. Making and selling something privately was highly irregular.” While I don’t agree with the aggressive IP strategies of The Tetris Company, I can understand how fired-up an ex-Soviet capitalist-at-heart who created and lost control of a wildly successful product would be. ↩︎
  2. The Tetris Guideline has changed over the years. Piece colors were locked in at the start, which is good. There’s enough consistency between Guideline versions, and few enough versions that any modern Tetris game should feel pretty familiar. ↩︎
  3. One visually-spectacular element of Tetris Worlds was its cover art, by Roger Dean. Dean also redid the Tetris logo, which continues to be used to this day. ↩︎

Get angry again (Unicode edition)

So, it’s a bit of a recurring theme that this administration makes some horrifying attack on some marginalized group and I feel the need to make some brief post here angrily tossing out organizations worth donating to. Of course, the topic this week is a series of actions threatening trans people¹ and hearkening back to the 1933 burning of the archives of the Institut für Sexualwissenschaft. I’m personally feeling less and less in control of how I’m handling the erosion of civil liberties, and part of me right now needs to write, beyond a brief scream into the ether. So here’s what this post is: if anything on this site has ever had any value to you, please just roll 1D10 and donate to:

  1. Trans Lifeline
  2. National Center for Transgender Equality
  3. Transgender Law Center
  4. Transgender Legal Defense & Education Fund
  5. Sylvia Rivera Law Project
  6. Trans Justice Funding Project
  7. Trans Women of Color Collective
  8. Trans Student Educational Resources
  9. Lambda Legal
  10. Southern Poverty Law Center

…and with that out of the way, for the sake of my own mental health, I’m going to quasi-continue my last post with a bit of binary-level explanation of text file encodings, with emphasis on the Unicode Transformation Formats (UTFs).

⚧ rights are 👤 rights!

…is a topical message made succinct via the vast character repertoire of Unicode. Note that if the above looks like ‘� rights are � rights!’, the first potentially unsupported character should be the transgender symbol and the second should be the human bust in silhouette emoji. These are Unicode code points 26A7 and 1F464, respectively. This is important: every other character falls under the scope of ASCII and therefore requires only a single byte. Written out as raw numbers, the transgender symbol’s code point requires two bytes, and the emoji’s requires three. So let’s see how this plays out.

All the sample hex dumps that follow were output from xxd, which uses a period (.) in the (right-hand side) ASCII display to represent non-ASCII bytes. In the text encodings that don’t support two- or three-byte code points, I have replaced these with an asterisk (*, hex 2A) prior to writing/dumping. ASCII is one such encoding – it supports neither character. So, let’s take a look at our string, ‘* rights are * rights!’:

00000000: 2A 20 72 69 67 68 74 73 20 61 72 65 20 2A        * rights are *
0000000e: 20 72 69 67 68 74 73 21 0A                        rights!.

Presumably this is obvious, but ASCII has a very limited character repertoire. In reality a 7-bit encoding, ASCII at least had the very important role of being an early standardized encoding, which was great! Before ASCII², any given system’s text encoding was likely incompatible with any other’s. This kind of fell apart when localizations required larger character repertoires, and the eighth bit was used for any number of Extended ASCII encodings. Because ASCII and a number of Extended ASCII encodings standardized under ISO 8859³ were so widely used, and are still so widely used, backward-compatibility remains important. In a very loose sense, Unicode could be seen as an extension onto ASCII: the first 128 code points (starting at U+0000) are ASCII exactly. So, ASCII is limited to 7 bits, and the various Extended ASCIIs are limited to one byte. What does our byte stream look like if we open this up to two bytes per character?

00000000: 26 A7 00 20 00 72 00 69 00 67 00 68 00 74        &.. .r.i.g.h.t
0000000e: 00 73 00 20 00 61 00 72 00 65 00 20 00 2A        .s. .a.r.e. .*
0000001c: 00 20 00 72 00 69 00 67 00 68 00 74 00 73        . .r.i.g.h.t.s
0000002a: 00 21 00 0A                                      .!..

UCS-2 is about the most straightforward way to expand the character repertoire to 65,536 characters. Every single character is given two bytes, which means suddenly we can use our transgender symbol (26 A7), and all of our ASCII symbols now essentially have a null byte in front of them (00 72 for a lowercase r). There are a lot of 00s in that stream. xxd shows us an ampersand toward the beginning, since 26 is the ASCII code point for &. xxd throws up dots for all the null bytes. Unicode 11.0’s repertoire contains 137,439 characters, a number greater than 65,536. Our emoji, as mentioned, sits at code point 1F464, beyond the FFFF supported by UCS-2 (and therefore replaced with an asterisk above). We can, however, encode the whole string with UCS-4:

00000000: 00 00 26 A7 00 00 00 20 00 00 00 72 00 00        ..&.... ...r..
0000000e: 00 69 00 00 00 67 00 00 00 68 00 00 00 74        .i...g...h...t
0000001c: 00 00 00 73 00 00 00 20 00 00 00 61 00 00        ...s... ...a..
0000002a: 00 72 00 00 00 65 00 00 00 20 00 01 F4 64        .r...e... ...d
00000038: 00 00 00 20 00 00 00 72 00 00 00 69 00 00        ... ...r...i..
00000046: 00 67 00 00 00 68 00 00 00 74 00 00 00 73        .g...h...t...s
00000054: 00 00 00 21 00 00 00 0A                          ...!....

…even more 00s, as every character now gets four bytes. Our transgender symbol lives on as 00 00 26 A7, our ASCII characters have three null bytes (00 00 00 72), and we can finally encode our emoji: 00 01 F4 64. You’ll see an errant d in the ASCII column, that’s xxd picking up on the 64 byte from the emoji. These two- and four-byte versions of the Universal Coded Character Set (UCS) are very straightforward, but not very efficient. If you think you might need to use characters above the FFFF range, suddenly every character you type requires four bytes – if this was for the sake of a single character, your filesize could nearly double. It could nearly quadruple if the majority of your file was characters from ASCII. So the better way to handle this is with the Unicode Transformation Formats (UTFs).

00000000: E2 9A A7 20 72 69 67 68 74 73 20 61 72 65        ... rights are
0000000e: 20 F0 9F 91 A4 20 72 69 67 68 74 73 21 0A         .... rights!.

UTF-8 is essentially the standard text encoding these days. Both the World Wide Web Consortium and the Internet Mail Consortium recommend UTF-8 as the standard encoding. It starts with the 7-bit ASCII set, and starts setting high bits for multi-byte characters. In a multi-byte character, the first byte starts with binary 110, 1110, or 11110, depending on how many bytes follow (one, two, or three, respectively). Those following bytes all begin with 10. Our transgender symbol requires three bytes: E2 9A A7. The A7 is familiar as the end of the code point, 26A7, but the first two bytes are not recognizable because of the above scheme.
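The lead-byte and continuation-byte scheme just described can be sketched by hand in Python. This is only an illustration: `utf8_encode` is a made-up name, it skips the validity checks a real encoder needs (surrogates, values above 10FFFF), and in practice you would simply call `str.encode("utf-8")`.

```python
def utf8_encode(code_point):
    """Encode one Unicode code point as UTF-8 bytes by hand (no validation)."""
    if code_point < 0x80:
        return bytes([code_point])          # plain 7-bit ASCII, one byte
    if code_point < 0x800:
        lead, continuation_count = 0b11000000, 1
    elif code_point < 0x10000:
        lead, continuation_count = 0b11100000, 2
    else:
        lead, continuation_count = 0b11110000, 3

    out = []
    for _ in range(continuation_count):
        # Each continuation byte is 10 followed by the next six low bits.
        out.append(0b10000000 | (code_point & 0b111111))
        code_point >>= 6
    out.append(lead | code_point)           # whatever bits remain go in the lead byte
    return bytes(reversed(out))


for cp in (0x72, 0x26A7, 0x1F464):
    encoded = utf8_encode(cp)
    print(f"U+{cp:04X} -> {' '.join(f'{b:02X}' for b in encoded)}")
```

For 26A7 this produces E2 9A A7, and for 1F464 it produces F0 9F 91 A4, matching the hex dump above.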

If we break 26A7 into 4-bit binary words, we get…

2    6    A    7
0010 0110 1010 0111

…and E29AA7 is…

E    2    9    A    A    7
1110 0010 1001 1010 1010 0111

E is our 1110 that signifies that the next two bytes are part of the same character. The next four bits are the beginning of our character, the 2 or 0010. The two following bytes each start with the bits 10, followed by six bits of code point information, so effectively our 26A7 is actually broken up like…

2    6/A…   …A/7
0010 011010 100111

…and we see that in reality, it was mere coincidence that our three-byte version ended in A7. The 7 is a given, but the A happened by chance. UTF-8 is a great format as far as being mindful of size is concerned, but it’s less than ideal for a user who needs to examine a document at the byte level. While code point 26A7 will always translate to E29AA7, a whole second mapping is needed, and the variable byte size per character means that a hex editor’s word size can’t be set to correspond directly to a character. At least it’s fairly easy to suss out at the binary level. UTF-16 looks like:

00000000: 26 A7 00 20 00 72 00 69 00 67 00 68 00 74        &.. .r.i.g.h.t
0000000e: 00 73 00 20 00 61 00 72 00 65 00 20 D8 3D        .s. .a.r.e. .=
0000001c: DC 64 00 20 00 72 00 69 00 67 00 68 00 74        .d. .r.i.g.h.t
0000002a: 00 73 00 21 00 0A                                .s.!..

UTF-16 is used internally at the OS level a lot, and fortunately doesn’t really make its way to end-users much. We can see that our transgender symbol, 26 A7 comes out unscathed since it takes only two bytes. Our emoji shows up as D8 3D DC 64, and the way we get there is very convoluted. First, UTF-16 asks that we subtract (hex) 10000 from our code point, giving us F464. We pad this so that it’s twenty bits long, and break it into two ten-bit words. We then add hex D800 to the first and DC00 to the second:

Original: F4         64
Ten-bit:  0000111101 0001100100
Hex:      003D       0064
Plus:     D800       DC00
Equals:   D83D       DC64
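The same arithmetic as a short Python sketch, in case it is easier to follow as code than as a table. The function name is an assumption for illustration, and nothing here checks that the input is a valid code point.

```python
def utf16_units(code_point):
    """Return the UTF-16 code unit(s) for one code point as a list of ints."""
    if code_point < 0x10000:
        return [code_point]                 # fits in one 16-bit unit as-is
    offset = code_point - 0x10000           # leaves at most twenty bits
    high = 0xD800 + (offset >> 10)          # top ten bits
    low = 0xDC00 + (offset & 0x3FF)         # bottom ten bits
    return [high, low]


for cp in (0x26A7, 0x1F464):
    units = " ".join(f"{unit:04X}" for unit in utf16_units(cp))
    print(f"U+{cp:04X} -> {units}")
```

Running it gives 26A7 for the transgender symbol and D83D DC64 for the emoji, the same values as in the dump and the table.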

UTF-16 has the same human-readability issues as UTF-8, and wastes a lot of bytes in the process. Next up would be UTF-32, but seeing as that puts us in four-bytes-per-character territory… It is functionally identical to UCS-4 above⁴.

All of this information is readily available elsewhere, notably in Chapter 2, Section 5 of The Unicode Standard. I haven’t seen a great side-by-side comparison of UCS and UTF formats at the byte level before, with a focus on how binary data lines up with Unicode code points. UTF-8 is the ‘gold standard’ for good reason – it allows the entire character repertoire to be represented while requiring the least amount of data. However, there are times when it’s necessary to examine text at the binary level, and for a human, this is much easier accomplished by reëncoding the text as UCS-4/UTF-32 and setting a 32-bit word size in your hex editor.
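For anyone who wants to reproduce that side-by-side comparison without xxd, here is a rough Python sketch. It assumes big-endian byte order with no byte order mark, since that is what the dumps in this post show, and it uses Python’s built-in codecs rather than anything xxd-specific. The asterisk substitution for the ASCII version mirrors what I did by hand.

```python
MESSAGE = "\u26a7 rights are \U0001F464 rights!\n"


def hex_bytes(data):
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{byte:02X}" for byte in data)


def ascii_with_asterisks(text):
    """Replace every non-ASCII character with '*' and encode as ASCII."""
    return "".join(ch if ord(ch) < 0x80 else "*" for ch in text).encode("ascii")


ENCODED = {
    "ASCII (with substitutions)": ascii_with_asterisks(MESSAGE),
    "UTF-8": MESSAGE.encode("utf-8"),
    "UTF-16 (big-endian)": MESSAGE.encode("utf-16-be"),
    "UCS-4/UTF-32 (big-endian)": MESSAGE.encode("utf-32-be"),
}

for label, data in ENCODED.items():
    print(f"{label}: {len(data)} bytes")
    print(hex_bytes(data))
```

The byte counts come out to 23, 28, 48, and 92 respectively, which is a decent summary of the size trade-off on its own.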

If you’ve made it this far into a post about turning numbers into letters, I have one more thing to say… Please get out and vote, eligible American citizens. Our civil liberties are at stake.

  1. When I started writing this post, there was ‘just’ the leaked memo, and the longer I took, the more attacks they piled on. It’s beyond cruel. ↩︎
  2. There were other standards before ASCII, notably IBM’s EBCDIC. This was certainly not as widely-supported as ASCII, nor is it widely used today. ↩︎
  3. The ISO 8859 standards are replicated under a number of publicly-available ECMA standards as well: ECMA-94, ECMA-113, ECMA-114, ECMA-118, ECMA-121, ECMA-128, and ECMA-144. ↩︎
  4. I think historically there have been some differences between UCS-4 and UTF-32, like UTF-32 being stricter about code point boundaries. However, Appendix C, Section 2 of The Unicode Standard states that “[UCS-4] is now treated simply as a synonym for UTF-32, and is considered the canonical form for representation of characters in [ISO/IEC] 10646.” ↩︎

Honey walnut, please

Apple recently stirred up a bit of controversy when they revealed that their bagel emoji lacked cream cheese. Which is a ridiculous thing to get salty over, but ultimately they relented and added cream cheese to their bagel. Which should be the end of this post, and then I should delete this post, because none of that matters. But it isn’t the end, because I saw a lot of comments pop up following the redesign that reminded me: people really don’t seem to get how emoji work. Specifically, I saw a lot of things like ‘Apple can fix the bagel, but we still don’t have a trans flag’ or ‘Great to see Apple put cream cheese on the bagel, now let’s get more disability emoji’. Both of those things would, in fact, be great¹, but they have nothing to do with Apple’s bagel suddenly becoming more edible.

Unicode is, in its own words, “a single universal character encoding [with] extensive descriptions, and a vast amount of data about how characters function.” It maps out characters to code points, and allows me to look up the division sign on a table, find that its code point is 00F7, and insert this into my document: ÷. Transformation formats take on the job of mapping raw bytes into these standardized code points – this blog is written and rendered in the transformation format UTF-8. Emoji are not pictures sent back and forth any more than the letter ‘A’ or the division sign are – they are Unicode code points also, rendered out in a font² like any other character. This is why if I go ahead and insert 1F9E5 (🧥), the resulting coat will be wildly different depending upon what system you’re on. If I didn’t specify a primary font for my site, the overall look of this place would be different for different users also, as the browser/OS would have its own idea of a default serif font.
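A quick way to see that an emoji is just another code point is to look one up the same way as the division sign. This Python sketch uses the standard `unicodedata` module; the `describe` helper is made up for this example, and the name that comes back depends on the Unicode version bundled with your Python, so an older interpreter may not know the coat.

```python
import unicodedata


def describe(code_point):
    """Return (character, name, UTF-8 bytes as hex) for one code point."""
    character = chr(code_point)
    name = unicodedata.name(character, "<not named in this Python's Unicode data>")
    utf8_hex = " ".join(f"{byte:02X}" for byte in character.encode("utf-8"))
    return character, name, utf8_hex


for cp in (0x00F7, 0x1F9E5):
    character, name, utf8_hex = describe(cp)
    print(f"U+{cp:04X} {name}: UTF-8 bytes {utf8_hex}")
```

What gets stored and sent is only those bytes; which picture of a coat you see is up to the font on the receiving end.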

This mapping, these code points, they are defined by The Unicode Consortium. The Consortium takes in proposals³, makes decisions on character proposals as well as technical matters, makes drafts, does all sorts of behind-the-scenes junk, and spits out the Unicode Standard. Major revisions to the Unicode Standard then become an ISO Standard (10646 Information Technology — Universal Coded Character Set (UCS)). And while Apple is a voting (full) member of the Consortium, adding new characters (even emoji) is a serious process, much different from having a graphic designer slap some paint on a doughy circle.

“How characters function” is an important aspect of emoji as well. Much like I can use the combining diaeresis, 0308 ( ̈), with an ‘a’ to make ‘ä’, combinations of glyphs⁴ work to bring skin tones and gender markers to emoji. So, again, when I saw people (in jest, I truly hope) suggesting that Apple allow users to choose their bagel topping much like they would skin tone… well, it’s not a very effective joke when that too is a function of the Unicode Consortium.
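Both kinds of combination can be poked at from Python. This is a small sketch, not a description of how any renderer works: the waving hand (1F44B) and the medium skin tone modifier (1F3FD) are simply examples I picked, and whether the pair displays as a single glyph depends on your font.

```python
import unicodedata

# 'a' followed by the combining diaeresis is two code points...
decomposed = "a\u0308"
# ...which NFC normalization folds into the single precomposed character.
composed = unicodedata.normalize("NFC", decomposed)
print(len(decomposed), len(composed), composed == "\u00e4")

# An emoji with a skin tone is also a sequence: base character, then modifier.
waving_hand = "\U0001F44B"
medium_skin_tone = "\U0001F3FD"
sequence = waving_hand + medium_skin_tone
print([f"U+{ord(ch):04X}" for ch in sequence])
```

Unlike ‘ä’, there is no single precomposed code point for the toned hand; the sequence stays two code points and the font decides how to draw it.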

Philadelphia Cream Cheese ran a handful of Twitter ads over the whole controversy, and now that the dust has settled, they ran one thanking Apple and the Unicode Consortium, which… is largely wrong in the other direction, since the glyph itself is entirely on Apple. Part of the Unicode Standard, however, is multilingual descriptive text for characters. Emoji are annotated under the CLDR Character Annotations, given a short name as well as comments that may offer other explanations. So, 1F404, COW, is helpfully also described as potentially representing ‘beef (on menus)’, and (likely against the wishes of some members)⁵ 1F4A9, PILE OF POO, ‘may be depicted with or without a friendly face’. The notes on bagel do, in fact, suggest that it can represent ‘schmear’, so perhaps in some way Unicode was to thank by subtly suggesting that bagels are canonically coated.

All of this to say that, while Apple and Google are both (among others, of course) high-level members of the Unicode Consortium, it is just that – a consortium of contributors that go through an involved process to create a functional international standard mapping of characters from A-Z to hieroglyphics to the vegetable pictograms we pepper our sexts with. Changing the visual representation of a character in the emoji font is a much less daunting task than changing an ISO standard. Which is why shouting at Apple on Twitter is unlikely to get a trans flag emoji introduced, but submitting a proposal to the Unicode Consortium just might.

  1. A handful of disability emoji have been assigned code points as draft candidates, including prosthetics, an ear with a hearing aid, the ASL sign for deafness, and guide and service dogs. Hopefully these make it in! ↩︎
  2. Because of various compatibility/completion issues, some platforms seem to do their own thing as far as handling emoji. But when things are done as per usual, it’s just a font that is selected as the best choice when the font being used lacks its own TEACUP WITHOUT HANDLE glyph. Same thing happens for non-emoji characters that are missing from whatever the font at hand is. ↩︎
  3. Non-emoji proposals are discussed at ‘How to Submit Proposal Documents’ and ‘Submitting Character Proposals’. ↩︎
  4. A primer, just in case… The code point is a number in hexadecimal that maps a byte or series of bytes to a character: 1F351. A character is the definition of what exists at that code point: Emoji, Peach. A glyph is a graphical representation of a character, typically presented through a font: 🍑. ↩︎
  5. A frowning pile of poo was proposed for Unicode 11, and some members revealed their true horror that the poo was ever anthropomorphized in the first place. While the battle over the frowning poo is very on-brand for 2018, little about it is as satisfying as reading the original proposal. ↩︎

JPEG Comments

A while back, floppy disk enthusiast and archivist @foone posted about a floppy find: the Alice JPEG Image Compression Software. I suggest reading the relevant posts about the floppy, but the gist is that @foone archived and examined the disk and was left with a bunch of mysterious .CMP files which appeared to have JPEG streams but did not actually function as JPEGs. Rather, they would load but only display an odd little placeholder1, identical for each file. I know a bit about JPEGs, and decided to try my hand at cracking this nut. The images that resulted were not particularly interesting – this was JPEG compression software from the early ’90s, clearly targeted at industries that would be storing a lot of images2 and not home users. The trick to the files, however, was a fun discovery.

The title of this post gives it away, I realize – the real images were effectively ‘commented out’. Here’s a hex dump of the relevant chunk of one of the photos:

(offset)   (hex)                                              (ascii)
00000270   F4 F5 F6 F7 F8 F9 FA FF C0 00 11 08 00 3C 00 50    .............<.P
00000280   03 01 21 00 02 11 01 03 11 01 FF DA 00 0C 03 01    ..!.............
00000290   00 02 11 03 11 00 3F 00 FF FE 00 0E 49 4D 41 47    ......?.....IMAG
000002A0   45 20 44 41 54 41 3D 3E C6 B1 3F E8 51 7F BB 52    E DATA=>..?.Q..R
000002B0   13 5E 64 BE 26 7D 65 1F E1 C7 D1 08 EA DB 73 B4    .^d.&}e.......s.

Something sure looks suspicious in that ASCII column, doesn’t it? Let’s talk briefly about JPEG files. JPEGs contain a number of different sorts of data: EXIF/metadata, the Huffman and quantization tables used to compress the image, information about the details of the image (bit depth, dimensions), and the image data itself, to name a few. All of this information is split up into chunks prefixed with a two-byte code: FF followed by another byte that says what the data that follows is. At offset 277, we see FF C0. This is the start of frame, and the next seventeen bytes tell us (among other things) that it’s an 8-bit/channel color image, 80x60 pixels3. At offset 28A, we run into FF DA, which is the start of the image itself. This only runs for 12 bytes (the scan header, whose own length field reads 00 0C), until we hit FF FE at offset 298. With no compressed data behind that header, those 12 bytes are all there is to the odd little placeholder image from above, and FF FE is, as you can probably guess, a comment.
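To make the chunk layout concrete, here is a minimal Python sketch that walks the markers in the dump above. The bytes are copied from the dump, starting at the FF C0 at offset 277; the function name `walk_segments` and the small name table are my own, not part of any JPEG library. It only works as a plain length-hopping walk because, in this excerpt, the comment follows the scan header directly.

```python
import struct

# Bytes 0x277 through 0x2A7 of the hex dump, starting at the FF C0 marker.
excerpt = bytes.fromhex(
    "FF C0 00 11 08 00 3C 00 50 03 01 21 00 02 11 01 03 11 01"  # start of frame
    "FF DA 00 0C 03 01 00 02 11 03 11 00 3F 00"                 # start of scan
    "FF FE 00 0E 49 4D 41 47 45 20 44 41 54 41 3D 3E"           # comment
)

# Illustrative name table covering only the markers seen in this excerpt.
MARKER_NAMES = {0xC0: "start of frame", 0xDA: "start of scan", 0xFE: "comment"}


def walk_segments(data, base_offset=0):
    """Yield (offset, marker, payload) for each length-prefixed segment."""
    pos = 0
    while pos + 4 <= len(data) and data[pos] == 0xFF:
        marker = data[pos + 1]
        # The big-endian length counts its own two bytes but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        yield base_offset + pos, marker, payload
        pos += 2 + length


segments = list(walk_segments(excerpt, base_offset=0x277))
for offset, marker, payload in segments:
    name = MARKER_NAMES.get(marker, "unknown")
    print(f"{offset:#05x}  FF {marker:02X}  {name}  ({len(payload)} payload bytes)")

# The frame header opens with precision, height, width, and component count.
precision, height, width, components = struct.unpack(">BHHB", segments[0][2][:6])
print(f"{precision}-bit, {width}x{height}, {components} components")

comment_text = segments[2][2].decode("ascii")
print(repr(comment_text))
```

A real JPEG would need more care after FF DA, since compressed data normally follows the scan header and has to be scanned for the next marker rather than skipped by length.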

Comments aren’t that prevalent in JPEGs. JFIF, EXIF, and XMP data are all stored in application-specific data chunks (much like the layer information in Adobe Fireworks PNGs). Comments are typically used to mark what encoder produced the JPEG, and that’s about it. But, much like using comments to soft-delete code, an entire image can be stuffed in there, waiting for a specific decoder (or hex editor user) to erase the placeholder image and the comment prefix. Presumably this is just what the Alice software did: it would find FF DA, and ignore everything until after FF FE 00 0E IMAGE DATA=>. Other decoders would simply ignore the real image, because that’s what the JPEG spec tells them to do.
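The hex-editor route can be sketched in a few lines of Python. This is a guess at one workable approach, not a record of what the Alice software did: it cuts out only the comment prefix and leaves the bytes after FF DA alone, on the assumption that the real compressed data needs that scan header. The `uncomment` function and the demo bytes are hypothetical; only the sixteen-byte `COMMENT` constant comes from the dump.

```python
# Marker (FF FE), length (00 0E = 14), and the text seen in the hex dump.
COMMENT = b"\xff\xfe\x00\x0eIMAGE DATA=>"


def uncomment(cmp_bytes: bytes) -> bytes:
    """Return the data with the first 'IMAGE DATA=>' comment segment cut out."""
    start = cmp_bytes.find(COMMENT)
    if start == -1:
        raise ValueError("no IMAGE DATA=> comment found")
    return cmp_bytes[:start] + cmp_bytes[start + len(COMMENT):]


# Synthetic stand-in for a .CMP file: a scan header, the comment, then
# made-up "compressed" bytes and the end-of-image marker.
scan_header = b"\xff\xda\x00\x0c" + bytes(10)
fake_scan = b"\x12\x34\xff\x00\x56\xff\xd9"
demo = scan_header + COMMENT + fake_scan

restored = uncomment(demo)
print(len(demo) - len(restored), "bytes removed")
```

On an actual file this would be something like reading the bytes, passing them through `uncomment`, and writing the result out with a .jpg extension. I have not verified that against the original disk images.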

Not seen above, but necessary for the process to work and interesting to consider is the sequence FF 00. Header chunks, comments included, announce their own length in the two bytes after the marker (the 00 0E after FF FE covers just the length itself and IMAGE DATA=>). Compressed image data has no such length, and only ends when the decoder encounters an FF byte that starts another marker. Once you get to the compressed image data, you’re likely going to need FF bytes that aren’t instructions to the decoder. These are essentially escaped by the two-byte sequence FF 00. The decoder knows that this is not the start of a new chunk, but rather a literal FF. This means that our commented-out image can (and does) contain several FF 00 sequences, and a decoder does not mistake any of them for a marker.
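The FF 00 escape is easy to demonstrate. Below is a small Python sketch of the stuffing, the unstuffing, and a search for the next real marker. The function names are mine, and the marker search is simplified: it treats restart markers (FF D0 through FF D7), which can legitimately appear inside a scan, the same as any other marker.

```python
def stuff(raw: bytes) -> bytes:
    """Escape every literal FF in compressed data as FF 00."""
    return raw.replace(b"\xff", b"\xff\x00")


def unstuff(stuffed: bytes) -> bytes:
    """Turn each FF 00 back into a literal FF."""
    return stuffed.replace(b"\xff\x00", b"\xff")


def next_marker(data: bytes, start: int = 0) -> int:
    """Return the offset of the next real marker at or after start, or -1."""
    pos = data.find(b"\xff", start)
    while pos != -1 and pos + 1 < len(data):
        follower = data[pos + 1]
        if follower == 0x00:      # FF 00 is a literal FF, keep looking
            pos = data.find(b"\xff", pos + 2)
        elif follower == 0xFF:    # FF FF: the first FF is only padding
            pos += 1
        else:
            return pos
    return -1


# Made-up scan bytes containing three literal FFs, then an end-of-image marker.
scan = bytes([0x12, 0xFF, 0x34, 0xFF, 0xFF])
stream = stuff(scan) + b"\xff\xd9"

marker_at = next_marker(stream)
print(stream.hex(" "), "-> marker at offset", marker_at)
```

The literal FF bytes survive the round trip, and the search skips all three of them before landing on FF D9.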

Finally, it’s worth noting that JPEG images end with FF D9, which are (expectedly) the last two bytes in any of the given .CMP files. The placeholder image doesn’t need its own FF D9, since the one at the end of the file is the next marker that’s encountered after the comment regardless. In fact, giving the placeholder its own FF D9 likely would have required additional logic in the Alice placeholder-removal scheme, as you would now have to ignore the end-of-image marker under (exactly) one specific condition on top of everything else.

This is obviously not a robust form of copyright protection, and seemingly lends itself to an inefficient set of Huffman and quantization tables as well. These inefficiencies could likely be handled better by modern encoders designed around needing tables for two images, and it is interesting to think of potential use-cases. One could, theoretically, comment out encrypted image data, while leaving a placeholder image that tells a user as much4. Practical? Likely not, but as much as we code nerds take our ability to comment out code for granted, it’s rather fascinating to see the same techniques played out in the binary sphere.

  1. I have a policy against including raster images in my posts, but this seems like a perfectly valid time to make an exception: [image: two 8 x 8 grids of greyscale pixels that start highly contrasted in the upper left and fade to the lower right]. ↩︎
  2. Among other things, the images seemed to be of forensic evidence, automobile damage, a patient’s teeth, and a museum artifact. ↩︎
  3. The sample image chosen for this demonstration was a thumbnail; 80x60 are actually the correct dimensions and not part of the placeholder. ↩︎
  4. Some data would likely still get through – image dimensions, for one. Presumably these are not sensitive, but what about the Huffman tables? How much information can be gleaned from the bits and bobs that define the image compression? Could these data chunks be encrypted and commented out as well? ↩︎