brhfl.com

Antiquine

A quine is a program that does nothing but output its own source code. In various programming challenges (notably, on PPCG) the word is often generalized to refer to those which are somehow related to this behavior – print the source code backward, input n and print n copies of the source, etc. An interesting challenge from 2013 floated to the top of PPCG this week a few weeks ago (I’ve been sitting on this post for a while), Print every character your program doesn’t have. While I don’t particularly feel the need to dive into everything I attempt on PPCG, this was a very interesting challenge in how seemingly trivial decisions that appear to shrink the code could have very well ended up growing it.

The premise was, for the printable ASCII set, print every character not present in your source code. In dc, the sort of baseline solution to beat is one which simply contains and comments out every single printable ASCII character, # !"$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ – 95 bytes. So that’s the thing to beat. Even if I couldn’t beat it, I’m not sure I would have submitted that – it’s just so boring1. My strategy was to:

The program, then, starts by pushing a list of ASCII code points that we need to exclude onto the stack, which I’ll get to after the fact. The meat of the program is [l.dP]s;34s.[dl.!=;s,l.1+ds.127>/]ds/x. This defines two macros, ; and / and runs / (as main essentially). Originally I had thought I would name all of my registers (macros included) with characters otherwise already used in the code, so as to not add bytes to the exclusion list. But because I (probably) needed every digit between 0 and 9, my list was already going to be partially incrementally generated, and since I needed > and < which are just a few code points away from 9, it was actually beneficial to introduce the ‘missing’ characters into the code to factor more into the incremental generation process. Theoretically, I could comment out some of these characters, but then I’d have to add the comment character, # to the list as well.

The macro / is rather straightforward. It duplicates top of stack, compares it to the counter ., runs the printing macro ; if they don’t match, ‘drops’2 the top of stack (which has an extra value on it if ; ran), increments the counter ., and continues running as long as we’re still below 127. The macro ; does nothing save for printing the ASCII value of . and leaving some padding on the stack – since / always deletes an item from the stack, we want it to be unimportant unless it’s removing an item from the list of exclusions.

Our list of exclusions is 120d115 112 108 100 93 91 80 62[d1-d48<:]ds:x45 43. We’re printing from low to high, so we make sure that from top to bottom, the stack is also arranged low to high. Since we compare our current ASCII value to this list of exclusions every time, we need to make sure there’s always something on the stack to compare to, so our largest/last value is duplicated. There’s a little decrementing macro in there to automatically push every value from 62 to 48 – the digits 0-9 as well as the aforementioned nearby characters that I used for macro names, etc. I tried generating the list in a handful of different bases – either to reduce bytes with fewer digits or to reduce exclusions with fewer possible numerals. In every case, the code to switch bases plus the overhead of adding that command to the exclusions list made it longer than simply using decimal.

This was an interesting challenge, one that I didn’t readily think I could beat the straightforward solution to. Many of the things that worked in my favor were counterintuitive – like introducing more characters that I had to exclude.


Golfing in Eukleides

Eukleides is decidedly not a golfing language, but when a geometry-related question came up on PPCG, I had to give it a shot. Eukleides can be quite rigid; for starters it is very strongly typed. Functions are declared by what they intend to return, so while set would be the shortest way to declare a function, it can’t really be exploited (unless returning a set is, in fact, desired). Speaking of sets, one thing that is potentially golfable is ‘casting’ a point to a set. Given a point, p, attempting a setwise operation on p will fail because a point does not automatically cast to a set (strict typing). p=set(p) will overwrite point p with a single-item set containing the point that was p. If, however, it is okay to have two copies of the point in the set, p=p.p is three bytes shorter.

If user input is required, the command number("prompt") reads a number from STDIN. The string for the prompt is required, though it can be empty (""). Thus, if more than four such inputs (or empty strings for other purposes) are required, it saves bytes to assign a variable with a one-letter name to an empty string.

Whitespace is generally unavoidable, but I did come to realize that boolean operators do not need to be preceded by whitespace if preceded by a number. So, if a==7or a==6 is perfectly valid. aor a, 7ora, 7 or6 are all invalid, however. This may be an interpreter bug, but for the time being it is an exploitable byte.

Finally, loci. Loci are akin to for-loops that automagically make sets. Unfortunately, they don’t seem to care much for referencing themselves mid-loop, which meant that I couldn’t exploit how short of a construction they are compared to manually creating a set in a for-loop.

This was a fun exercise, and just goes to show that if you poke around at any language enough, you’ll find various quirks that may prove useful given some ridiculous situation or another.


Nearest Fibonacci number in dc

I hadn’t really exercised my code golf skills in a while, but saw a fun challenge today that seemed like an easy solve, well-suited for dc. The challenge was, given a positive integer, return the nearest Fibonacci number. The Fibonacci sequence is fun on its own in a stack language; in dc one step can be performed with something like dsf+lfr. Given that the stack is prepared with two integers, this will duplicate the top, store it in an intermediate register f, add the two previous values, push f back onto the stack, and then swap them such that they’re in the correct order. It doesn’t pollute the stack, either, these two values are all that remain.

For this challenge, those two values are all I ever need – my input (which I will refer to as i) is always going to fall between two consecutive numbers in the Fibonacci sequence (or on one of them, in which case I would only need that value to test against). Keeping as much work on the stack as possible is ideal when golfing in dc because the byte cost of working with registers adds up quickly. So my strategy is to seed the Fibonacci generator with two 1s, and run it until the larger of the two Fibonacci numbers is greater than i. One of those two Fibonacci numbers will be the right one, and if i happened to be a Fibonacci number, I’ve just generated an extra one for no reason. I convert both of the Fibonacci numbers to their respective differences from i. Since I know for a fact that the top of the stack is greater than i, and the second value on the stack is either less than or equal to i, I don’t have to worry about dc’s lack of an absolute value mechanism; I simply subtract i from the big one and subtract the small one from i. Since I know which difference is which, I have no need to retain the Fibonacci numbers. I simply compare the differences, and then reconstruct the Fibonacci number by adding or subtracting the difference to i depending on which difference ‘won’. The code:

?si1d[dsf+lfrdli>F]dsFxli-rlir-sd[lild-pq]sDdld<Dli+p

…and the explanation:

?si                  Store user input in register i
1d                   Push 1 to stack, duplicate it to seed Fibonacci sequence
[dsf+lfrdli>F]dsFX   Macro F: Fibonacci generator as described above, with the
                       addition of loading register i and continuing to run F
                       until the top-of-stack Fibonacci number is larger than i
li-                  Turn our high Fibonacci number into its difference
rlir-                Swap the stack, and turn our low Fibonacci number into its
                       difference. The stack stays swapped, but it doesn't 
                       matter
sd                   Store our low Fibonacci difference in register d
[lild-pq]sD          Macro D: reconstruct the low Fibonacci number by 
                       subtracting d (the difference) from i; print the result
                       and quit. Macro is stored for later use.
dld<D                Duplicate the high Fibonacci difference and push the low 
                       Fibonacci difference onto the stack. Compare them and run
                       D if the low difference is less than the high difference
li+p                 Else, add i to the high difference and print the result


wo: A truth machine in wo3

In wo3, we get two extra instructions for a total of four. OISCs can be Turing complete, so four instructions should be enough to give us some brunt. At this point, some amount of math is probably a good idea. Instruction three, then, is ADD, which does what you think it does. Since we’re capable of entering negatives, we can pretty much get to any number we want at this point, albeit terribly inefficiently.

Branching would be very convenient at this point as well, so a simple SKIPZERO instruction gets the no. 4 position; it pops a value from the stack and if that value is zero, it skips until past the next instruction. All odd (input) words are skipped until an even (instruction) word is encountered, at which point that is skipped and the next word is interpreted. Skip if zero is a standard simple conditional branch, but I’ll likely change this to skip if (skip register), so a user can theoretically set the branch condition via a register (with a command certain to be available in wo4).


wo: Numbers

A draft of this post began:

A few thoughts on numbers in wo. Numbered input is pushed to the stack via LSB==1. LSB==0 signifies an instruction, so in either case, effective word size is 1 bit less than total word size. The stack itself won’t care what sort of numbers it’s holding, input is the tricky part. wo3 would allow you to enter the integers 0-3, or 2-1 depending on unsigned vs. signed input. wo4 0-7 or 4-3. These aren’t big numbers to work with, so getting instructions in early to allow for more complicated input would be welcome. The easiest solution here is simply getting the WORDSIZE instruction out of the way as early as possible, ideally in two bits. With signed input, this would only get a user up to a 4-bit word size with 6 bits of input. Unsigned, a user could get up to 6-bit word size in the same 6-bit input, which raises the question of whether allowing a single word input of 1 is more important than more immediate access to slightly higher integers. Signed input would reduce the need for a sign-changing instruction at the low end of the set, and there are little hacks that could be put in place – for example, if 1s’ complement was used instead of 2s’, 0 could act as a stand-in for the next highest byte size or the like. Thus 6 bits of input could immediately kick word size up to a full byte.

As outlined in this post killing off WORDSIZE, I have changed my thoughts on that matter. I do think input will (by default) be 1s’ complement, with 0 receiving special treatment, what I am internally referring to as the Magic Number or MNum. MNum would allow some instructions to serve multiple purposes, which could theoretically put us on the path to larger integers at lower word sizes. Additionally, MNum can act as a stand-in for a predetermined value for other instructions, again opening up some integer growth options.


wo: Word size

A few things have occurred to me regarding the WORDSIZE instruction in wo. Namely, that changing word size in the middle of a program could be entirely unpredictable – my first thought would be to break every word out into an array before interpreting (aside from the WORDSIZE instruction itself), but that would be impossible – the stack would necessarily have to be computed before WORDSIZE could execute. Even before this, I was wondering if I should really prioritize it so low in the instruction set, or if it just turns into an abuse of a reduced instruction set claim.


wo: Introduction

I’ve been thinking a lot about languages designed for code golf and programming puzzles/challenges. They run no risk of extinction — seems a new utterly inexplicable dialect is always freshly hatching. And here I am, about to fan these flames with my own bonkers idea. I call it wo1. Here’s the thing – there will be several posts in this series, as I figure out how I would ideally like this thing to work. There will be a lot of theoretical code. But it is very possible that I will never actually bother writing a reference implementation for it. I’d like to think that I will, but I’m really not a great programmer and the fun for me is largely in the thought experiment, the puzzle.

There are a few key points that I think will make this language unique. First and foremost, it’s based on binary instructions with variable word sizes. The interpreter should set the initial word size based on invocation: wo4 would start with 4 bit words, wo7 with 7 bit words, etc. Smaller word sizes mean more instructions per byte, but fewer instructions are available (and smaller numbers can be pushed to the stack). It is postfix, stack-based, influenced by RPN, RPL, Forth, and dc. The main stack is initialized at zero, and can never be empty. I am very seriously considering a stack that grows but does not shrink (aside from clears) – drop operations would duplicate the bottom of the stack like RPN. Word size would be resettable via a relatively low (preferably 3-bit) instruction. A separate stack would handle certain scratch activities like string manipulation. Least Significant Bit (LSB) would determine stack input vs. instruction execution. Nearly every instruction would be rewritable (don’t need GOTO? plop something else in there before dropping your word size and taking care of business). That pretty much sums up the core philosophy/principals, the big remaining question is implementation — primarily, instruction set.


Collatz sequences in dc

Inspired by this recent post and code snippet by John Gruber, I decided to have a go at outputting Collatz conjecture sequences in dc. Running

    dc -e '?[r3*1+]so[d2~1=opd1<x]dsxx'

from the command line will take a single value as input, and run the sequence. ? takes input, puts it on the stack. [r3*1+]so is our macro for n3+1, what we do when we encounter an odd value. [d2~1=opd1<x]dsxx is our main macro, as well as our macro for handling even numbers (n/2). First it duplicates the value on the stack (last iteration) and divides by 2 to return the quotient and the remainder. 1=o uses the remainder to test for oddness and run o if true. Here I do something rather lazy for the sake of concise code: that initial duplication was solely for the sake of oddness. o swaps the top of the stack to get that value back, then multiplies by three and adds one, leaving this value on the stack. Evenness will never run this, and since the test for evenness leaves the outcome for evenness on the stack (what luck!), and the initial duplication is below it. Either way, the bottom of the stack is going to fill up with a lot of garbage, which should be inconsequential unless our sequences become absurdly long. At this point, the top of our stack is the result of the current step, so we print it, duplicate it, and run x if it’s greater than 0. Finally, we duplicate our main macro, save it as x, and then execute the copy we left on the stack.


dc as a code golf language

Code golf is a quirky little game – complete the challenge at hand in your preferred programming language in as few bytes as possible. Whole languages exist just for this purpose, with single-character commands and little regard for whitespace. dc always seemed like a plausible language for these exercises, and I recently attempted a few tasks which I would not ordinarily use dc for, notably 99 Bottles of Beer.