Kakoune
I’m not writing this post in vim, which is really a rather odd concept for me. I’ve written quite a bit about vim in the past; it has been my most faithful writing companion for many years now. Part of the reason is its portability and POSIX inclusion – it (or its predecessor, vi) is likely already on a given system I’m using, and if it isn’t, I can get it there easily enough. But just as important is the fact that it’s a modal editor, where text manipulation is handled via its own grammar and not a collection of finger-twisting chords. There aren’t really many other modal editors out there, likely because of that first point – if you’re going to put the effort into learning such a thing, you may as well learn the one that’s on every system (and the one with thousands of user-created scripts, and the one where essentially any question imaginable is just a Google away…). So, I was a bit surprised when I learned about Kakoune, a modal editor that simply isn’t vim.
Now, I’ve actually written a couple of recent posts in Kakoune so that I could get a decent feel for it, but I have no intention of leaving vim. I don’t know that I would recommend people learn it over vim, for the reasons mentioned in the previous paragraph. Though if those things were inconsequential to a potential user, Kakoune has some very interesting design ideas that I think would be more approachable to a new user. Heck, it even has a Clippy:
~ ╭──╮ ╭───┤nop├────╮
~ │ │ │ do nothing │
~ @ @ ╭╰────────────╯
~ ││ ││ │
~ ││ ││ ╯
~ │╰─╯│
~ ╰───╯
nop unset-option █
:nop content/post/2018-06/kakoune.md 17:1 [+] prompt - client0@[2968]
Here are a few of my takeaways:
- There’s no inbuilt window management/splitting, by design, which is something that I understand but do not like. The idea is that different sessions running in different X windows or tmux splits can communicate with each other, so the windowing burden is better left to these systems. This is fine, I guess, on my Linux box with i3. Less great on my Cygwin or Darwin terminals where I don’t particularly want to run a multiplexer just for a text editor. While I often have multiple splits open in vim to bounce content around, I do the rest of my multitasking via simple job control. Lack of windowing also makes it unsuitable for diffing.
- There’s no inbuilt file browser, which is also by design. This is technically true of vim as well – `netrw` is a plugin, but it’s there by default. `netrw` is much-maligned, but when necessary I do like to be able to `:e .` and find my way around.
- Kakoune takes a very long time to start up (well, for a text editor), which is odd since the previous two design decisions are listed alongside the editor’s goal of being snappy.
- There’s no inbuilt spellcheck, which makes sense if you’re only targeting coders. But only targeting coders is kind of a dumb choice when minimalist editors are great for distraction-free prose writing, and formats like Markdown, Textile, and Fountain make WYSIWYG a regrettable memory.
- Normal mode is always a selection mode, essentially akin to Visual mode in vim. This is a real standout feature that both cleans up the number of modes (compared to vim) and simplifies the language and consequences of an action. In vim, I can make a guess at how many words I need to delete, aiming to under- instead of over-shoot, and end up doing something like `10dw 2dw`, or I can step into Visual mode. In Kakoune, `10w` automatically selects those ten words, and then `2W` adds two more to the selection. I guess this seems trivial, but in practice I feel a lot more grounded as far as knowing what text I’m operating on – and this is after years of using vim. Moving around as normal (with hjkl) is still a selection, it’s just a selection of one character.
- The cursor doesn’t change when you switch between normal and insert modes, which may or may not be related to the previous point. Some people seem to think it is, but I don’t think anyone on the dev team has said as much, and it honestly makes no sense – a block cursor in normal mode suggests that operations happen on characters; a bar cursor in insert mode suggests that you can’t be on a character, only before or after one. Anyway, I had no idea how much I relied on that until I kept getting mode-lost. Apparently it isn’t being fixed any time soon.
- Fortunately, changing the status bar’s color based on mode was not terribly tricky, handled via something Kakoune calls hooks, very similar to autocmds in vim. More awkward, in my opinion – every hook needs a filename regex, so in the case of changing part of the colorscheme based on mode, you have to drop in a pointless `.*`. At least the format of these commands is always the same; there’s no contextual change. It should also be just as easy to change the cursor color, partially mitigating the previous issue.
- In general, configuration is awkward, just like the previous point. Much of this is justifiable – for instance, word-wrapping and line numbers aren’t normal options, they’re baked into ‘highlighters,’ which feels super weird coming from vim, but might make sense for some users who, say, want line numbers for code but not prose. I prefer consistency, and even wig out a bit when vim’s help behaves differently than my other files.
- Despite the awkwardness, making a colorscheme was fairly straightforward, which is good because defaults are kind of bonkers, and the default colors were not terribly serviceable. On the topic of defaults, I have yet to get Markdown highlighting to work. Everything is just a little bit fidgety. I’m going to play with it a bit more, but will probably post decolletage.kak at some point.
- Keybindings rely heavily on Alt instead of Ctrl, which… might be justifiable, because in a standard keyboard configuration, Alt is less awkward to chord than Ctrl. But it goes against all terminal standards, and people have ways of making Ctrl work (*cough* reassign Caps Lock *cough*). If the dev team wants any vim converts, replacing things like Ctrl-o with Alt-; is just a weird move.
- Documentation is severely lacking in comparison with vim, but there is one area that’s considerably better. Remember Clippy up there? There is a ton of inline help and guidance, which works great when you sort of know what you’re doing, but isn’t as helpful as far as discoverability.
- Multiple cursors are built in, and alongside the normal-is-always-selection-mode paradigm, it works really well. For a while I used vim-multiple-cursors, which is pretty impressive, but kind of felt like a crutch or a bad habit more than anything else. There was nearly always a native vim way to do what I was trying to accomplish. Kakoune’s multiple cursor system is a native solution, and much like the selection model, it actually feels like it’s helping and guiding you.
I guess there are far more negative points in that list than positives, but the truth is that the positives are really positive. Kakoune has done an incredible job of changing vim paradigms in ways that actually make a lot of sense. It’s a more modern, accessible, streamlined approach to modal editing. Streamlining even justifies several of my complaints – certainly the lack of a file browser, and probably the lack of splitting fall squarely under the Unix philosophy of Do One Thing and Do It Well. I’m going to continue to try to grok Kakoune a bit better, because even in my vim-centric world, I can envision situations where the more direct (yet still modal) interaction model of Kakoune would be incredibly beneficial to my efficiency.
Revisiting my Linux box
My Mac Pro gave up the ghost last week, so while I wait for that thing to be repaired, I’ve been spending more time on my Lenovo X220 running Ubuntu. While I do use it for writing fairly often, that doesn’t even require me to start X. Using it a bit more full-time essentially means firing up a web browser alongside whatever else I’m doing, which has led to some additional mucking around. For starters, I went ahead and updated the system to 16.04, which (touch wood) went very smoothly, as has every Linux upgrade I’ve performed in the past couple of years. This used to be a terrifying prospect.
Updating things meant that the package list in `apt` also got refreshed, and I was a wee bit shocked to find that Hugo, the platform I use to generate this very blog, was horribly out of date. Onward to their website, and they recommend installing via Snapcraft, which feels like a completely inexplicable reinvention of the package management wheel. Snapcraft is supposedly installed with Ubuntu 16.04, but apparently not on a minimal system, so I went and did that myself. Of course it has its own `bin/` to track down and add to the ol’ `$PATH`, but whatever – Hugo was up to date. I think I `sudo`ed a bit recklessly at one point, since some stuff ended up owned by `root` that shouldn’t have been, but that was an easy enough fix.
I run `uzbl` as a minimalist web browser, and have Chromium installed for something a bit more full-featured. I decided to install Firefox, since it is far less miserable a browser than it has ever been, and its keyboard navigation is far better than Chromium’s. Firefox runs well, and definitely fits better into my keyboard-focused setup, but there is one snag: PulseAudio. At some point, the Firefox team decided not to support ALSA directly, and it now relies on PulseAudio exclusively for audio. I can see small projects using PulseAudio as a crutch, but for a major product like Firefox it just feels lazy. PulseAudio is too heavy and battery-hungry, and I will not install it, so for the time being I’m just not watching videos and the like in Firefox. I did stumble upon the apulse project, but so far haven’t had luck with it.
I use `i3` as my window manager, and I love it so much – when I’m not using this laptop as a regular machine, I forget how wonderful tiling window managers are. When I move to my cluttered Windows workspace at the office, I miss `i3`. Of course, I tend to have far more tasks to manage at work, but there’s just something to be said for the minimalist, keyboard-centric approach.
I had some issues with `uxterm` reporting `$TERM` as `xterm` and not `xterm-256color`, which I sorted out – a nice reminder that fiddling with `.Xresources` is a colossal pain. I’m used to mounting and unmounting things on Darwin, and it took me a while to remember that `udisksctl` was the utility I was looking for. Either I hadn’t hopped on wireless since upgrading my router, or the Ubuntu upgrade wiped out some settings, but I had to reconnect. `wicd-curses` is really kind of an ideal manager for wireless, no regrets in having opted for that path. I never got around to getting Bluetooth set up, and a cursory glance suggests that there isn’t a curses-based solution out there. What else… oh, SDL is still a miserable exercise.
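For reference, the `udisksctl` calls in question look something like this – the device path here is just a stand-in for whatever block device is being mounted:
# mount a partition without needing root (device path is just an example)
udisksctl mount -b /dev/sdb1
# and detach it again when done
udisksctl unmount -b /dev/sdb1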
All in all, this setup still suits a certain subset of my needs very well. Linux seems to be getting less fiddly over time, though I still can’t imagine that the ‘year of desktop Linux’ is any closer to the horizon. I wouldn’t mind living in this environment, though I would still need software that’s only available on Mac/Win (like CC), and the idea of my main computer being a dual-boot that largely keeps me stuck in Windows is a bit of a downer. Perhaps my next experiment will be virtualization under this minimal install.
Separating cd and pushd
(A note for `bash` users: I am a `zsh` user and this was written from that standpoint.) One piece of advice that I’ve seen a lot in discussions on really tricking out one’s UNIX (&c.) shell is either setting an alias from `cd` to `pushd` or turning on a shell option that accomplishes this. Sometimes the plan includes other aliases or functions to move around on the directory stack, and the general sell is that now you have something akin to back/forward buttons in a web browser. This all seems to be based on the false premise that `pushd` is better than `cd`, when the reality is that they simply serve different purposes. I think that taking `cd` out of the picture and throwing everything onto the directory stack greatly reduces the stack’s usefulness. So this strategy simultaneously restricts the user to one paradigm and then makes that paradigm worse.
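For reference, the setup being argued against usually looks like one of these (shown for `zsh`; the option name differs or doesn’t exist elsewhere):
# either shadow cd entirely with an alias…
alias cd='pushd'
# …or have zsh push every directory change onto the stack automatically
setopt auto_pushd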
It’s worth starting from the beginning here. `cd` changes directories, and that’s about it. You start here, tell it to go there, now you’re there. `pushd` does the same thing, but instead of just launching the previous directory into the ether, it pushes it onto a last in, first out directory stack. `pushd` is helped by two other commands – `popd` to pop a directory from the stack, and `dirs` to view the stack.
% mkdir foo bar baz
% for i in ./*; pushd $i && pushd
% dirs -v
0 ~/test
1 ~/test/foo
2 ~/test/baz
3 ~/test/bar
`dirs` is a useful enough command, but its `-v` option makes its output considerably better. The leading number is where a given entry is on the stack; this can be globbed with a tilde, and `~0` is always the present working directory (`$PWD`). You’ll see in my little snippet above that in addition to `pushd`ing my way into the three directories, I also call `pushd` on its own, with no arguments. This basically just instructs `pushd` to flip between `~0` and `~1`:
% pushd; dirs -v
0 ~/test/foo
1 ~/test
2 ~/test/baz
3 ~/test/bar
This is very handy when working between two directories, and one reason why I think having a deliberate and curated directory stack is far more useful than a stack of every directory you’ve ever `cd`ed into. The other big reason is the tilde glob:
% touch ~3/xyzzy
% find .. -name xyzzy
../bar/xyzzy
So the directory stack allows you to do two important things: easily jump between predetermined directories, and easily access predetermined directories. This feels much more like a bookmark situation than a history situation. And while `zsh` (and presumably `bash`) has other tricks up its sleeve that let users make easy use of predetermined directories, the directory stack does this very well in a temporary, ad hoc fashion. `cd` actually gives us one level of history as well, via the variable `$OLDPWD`, which is set whenever `$PWD` changes. One can do `cd -` to jump back to `$OLDPWD`.
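As a throwaway illustration (separate from the `~/test` example above), `cd -` both switches to and echoes the old directory:
% cd /etc
% cd /tmp
% cd -
/etc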
`zsh` has one more trick up its sleeve when it comes to the directory stack. Using the tilde notation, we can easily change into directories from our stack. But since this is basically just a glob, the shell simply evaluates it and says ‘okay, we’re going here now’:
% pushd ~1; dirs -v
0 ~/test
1 ~/test/foo
2 ~/test
3 ~/test/baz
4 ~/test/bar
Doing this can create a lot of redundant entries on the stack, and then we start to get back to the cluttered stack problem that started this whole thing. But the `cd` and `pushd` builtins in `zsh` know another special sort of notation: plus and minus. Plus counts up from zero (and therefore lines up with the numbers used in tilde notation and displayed using `dirs -v`), whereas minus counts backward from the bottom of the stack. Using this notation with either `cd` or `pushd` (it is a feature of these builtins and not a true glob) essentially pops the selected item off of the stack before evaluating it.
% cd +3; dirs -v
0 ~/test/baz
1 ~/test/foo
2 ~/test
3 ~/test/bar
% pushd -0; dirs -v
0 ~/test/bar
1 ~/test/baz
2 ~/test/foo
3 ~/test
…and this pretty much brings the stack concept full circle, and hopefully hits home why it makes far more sense to curate this stack versus automatically populating it whenever you change directories.
Extracting JPEGs from PDFs
I’m not really making a series of ‘things your hex editor is good for’, I swear, but one more use case that comes up frequently enough in my day-to-day life is extracting JPEGs from PDF files. This can be scripted simply enough, but I find doing these things manually from time to time to be a valuable learning experience.
PDF is a heck of a file format, but we really only need to know a few things right now. PDFs are made up of objects, and some of these objects (JPEGs included) are stream objects. Stream objects always have some relevant data stored in a thing called a dictionary, and this includes two bits of data we need to get our JPEG: the Filter tells the viewer how to interpret the stream, and the Length tells us how long, in bytes, the data is. The filter for JPEGs is ‘DCTDecode’, so we can open up a PDF in a hex editor (I’ll be using `bvi` again) and search for this string to find a JPEG. Before we do, one final thing we should know is that streams begin immediately after an End Of Line (EOL) marker following the word ‘stream’. EOL in a PDF should always be two bytes – 0D 0A, or CR LF.
/DCTDecode
Enter
00002E80 6C 74 65 72 2F 44 43 54 44 65 63 6F 64 65 2F 48 lter/DCTDecode/H
00002E90 65 69 67 68 74 20 31 31 39 2F 4C 65 6E 67 74 68 eight 119/Length
00002EA0 20 35 35 33 33 2F 4E 61 6D 65 2F 58 2F 53 75 62 5533/Name/X/Sub
00002EB0 74 79 70 65 2F 49 6D 61 67 65 2F 54 79 70 65 2F type/Image/Type/
00002EC0 58 4F 62 6A 65 63 74 2F 57 69 64 74 68 20 31 32 XObject/Width 12
00002ED0 31 3E 3E 73 74 72 65 61 6D 0D 0A FF D8 FF EE 00 1>>stream.......
/DCTDecode 00002E85 \104 0x44 68 'D'
This finds the next ‘DCTDecode’ stream object and puts us on that leading ‘D’, byte offset 2E85 (decimal 11909) in this instance. Glancing ahead a bit, we can see that the Length is 5533 bytes. If we then search for ‘stream’ (`/stream` Enter), we’ll be placed at byte offset 2ED3 (decimal 11987). The word ‘stream’ is 6 bytes, and we need to add an additional 2 bytes for the EOL. This means our JPEG data starts at byte offset 11995 and is 5533 bytes long.
How, then, to extract this data? It may not be everyone’s favorite tool, but `dd` fits the bill perfectly. It allows us to read an input file, skip ahead to a byte offset, copy a set number of bytes, and output the resulting chunk of file – just what we want. Assuming our file is ‘test.pdf,’ we can output ‘test.jpg’ like…
dd bs=1 skip=11995 count=5533 if=test.pdf of=test.jpg
`bs=1` sets our block size to 1 byte (which is important – `dd` is largely used for volume-level operations where blocks are much larger). `skip` skips ahead that many blocks (here, bytes), essentially the initial offset. `count` tells it how many blocks (again, bytes here) to read. `if` and `of` are the input and output files, respectively. `dd` doesn’t follow normal Unix flag conventions – there are no prefixing dashes, and those equal signs are quite atypical – and it is quite powerful, so it’s always worth reading the manpage.
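If `dd` isn’t your cup of tea, the same extraction can be approximated with `tail` and `head` – just note that `tail -c +N` is 1-indexed, so the offset from above gets bumped by one:
# start at byte 11996 (offset 11995 + 1) and keep the next 5533 bytes
tail -c +11996 test.pdf | head -c 5533 > test.jpg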
Semaphore and sips redux
(A note on `sem -j +5`: this allows 5 jobs to run at a time. `-j` can be used with integers, percentages, and +/- values, such that `-j +0` runs as many jobs as there are cores, `-j -1` runs one fewer job than the available cores, etc.)
I was going to simply edit my last post, but this might warrant its own, as it’s really more about `sem` and `parallel` than it is `sips`. `parallel`’s manpage describes it as ‘a shell tool for executing jobs in parallel using one or more computers’. It’s kind of a better version of `xargs`, and it is super powerful. The manpage starts early with a recommendation to watch a series of tutorials on YouTube and continues on to example after example after example. It’s intense.
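Just for flavor – this isn’t something I get into below – the same `sips` job can be handed to `parallel` directly, using its `:::` argument separator and its `{.}` replacement string (the input with its extension stripped):
# convert every TIFF to a PNG, one job per core by default
parallel sips -s format png {} --out {.}.png ::: ./*.tif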
In my previous post, I suggested using `sem` for easy parallel execution of `sips` conversions. `sem` is really just an alias for `parallel --semaphore`, described by its manpage (yes, it gets its own manpage) as a ‘counting semaphore [that] simply waits for a semaphore to become available and then runs the command given’. It’s a convenient and fairly accessible way to parallelize tasks. Backing up for a second: that manpage focuses on some of the specifics of how it queues things up, how it waits to execute tasks, etc. It does this using toilet metaphors, which is a whole other conversation, but for the most part it’s fairly clear, and it’s what I tend to reference when I’m figuring something out using `sem`.
In my last post (and in years of converting things this way), I had to decide between automating the cleanup/`rm` process or parallelizing the `sips` calls. The problem is, if you do this:
for i in ./*.tif; sem -j +5 sips -s format png "$i" --out "${i/.tif/.png}" && rm "$i"
…the parallelism gets all thrown off. `sem` executes, queues up `sips`, presumably exits 0, and then `rm` destroys the file before `sem` even gets the chance to spawn `sips`. None of the files exist, and `sips` has nothing to convert. The `sem` manpage doesn’t really address chaining commands in this manner – presumably it would be too difficult to fit into a toilet metaphor. But it occurred to me that I might come up with the answer if I just looked through enough of the examples in the `parallel` manpage (worth noting that a lot of the `parallel` syntax is specific to not being run in semaphore mode). The solution is facepalmingly simple: wrap the `&&` in double quotes:
for i in ./*.tif; sem -j +5 sips -s format png "$i" --out "${i/.tif/.png}" "&&" rm "$i"
…which works a charm. We could take this even further and feed the PNGs directly into `optipng`:
for i in ./*.tif; sem -j +5 sips -s format png "$i" --out "${i/.tif/.png}" "&&" rm "$i" "&&" optipng "${i/.tif/.png}"
…or potentially add `optipng` to the `sem` queue instead:
for i in ./*.tif; sem -j +5 sips -s format png "$i" --out "${i/.tif/.png}" "&&" rm "$i" "&&" sem -j +5 optipng "${i/.tif/.png}"
…I’m really not sure which is better (and I don’t think `time` will help me, since `sem` itself exits pretty quickly).
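If I ever do want to time it, my best guess (from the `sem` manpage rather than anything tested here) is `sem --wait`, which blocks until everything queued on the semaphore has finished – wrapping the loop plus a final `--wait` in `time` should then measure the whole queue rather than just the dispatching:
# time the whole queue, not just the act of queueing it up
time ( for i in ./*.tif; do sem -j +5 sips -s format png "$i" --out "${i/.tif/.png}" "&&" rm "$i"; done; sem --wait )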
Darwin image conversion via sips
I use Lightroom for all of my photo ‘development’ and library management needs. Generally speaking, it is great software. Despite being horribly nonstandard (that is, using nonnative widgets), it is the only example of good UI/UX that I’ve seen out of Adobe in… at least a decade. I’ll be perfectly honest right now: I hate Adobe with a passion otherwise entirely unknown to me. About 85-90% of my professional life is spent in Acrobat Pro, which gets substantially worse every major release. I would guess that around 40% of my be-creative-just-to-keep-my-head-screwed-on time is spent in various pieces of CC (which, subscription model is just one more fuck-you, Adobe). But Lightroom has always been special. I beta tested the first release, and even then I knew… this was the rare excuse for violating so many native UI conventions. This made sense.
Okay, from that rant we come up with: thumbs-down to Adobe, but thumbs-up to Lightroom. But there’s one thing that Lightroom has never opted to solve, despite so many cries, and that is PNG export. Especially with so many photographers (myself included) using flickr, which reencodes TIFFs to JPEGs, but leaves the equally lossless PNG files alone, it is ridiculous that the Lightroom team refuses to incorporate a PNG export plugin. Just one more ’RE: stop making garbage’ memo that I need to forward to the clowns at Adobe.
All of this just to come to my one-liner solution for Mac users… `sips` is the CLI/Darwin equivalent of the image conversion machinery that MacOS uses in Preview, etc. The manpage is available online, conveniently. But my use is very simple – make a bunch of stupid TIFFs into PNGs.
for i in ./*.tif ; sips -s format png "$i" --out "${i/tif/png}" && rm "$i"
…is the basic line that I use on a directory full of TIFFs output from Lightroom. Note that this is `zsh`, and I’m not 100% positive that the variable substitution is valid `bash`. Lightroom seemingly outputs some gross TIFFs, and `sips` throws up an error for every file, but still exits 0 and spits out a valid PNG. `sips` does not do parallelism, so a better way to handle this may be (using semaphore):
for i in ./*.tif; sem -j +5 sips -s format png "$i" --out "${i/tif/png}"
…and then cleaning up the TIFFs afterward (`rm ./*.tif`). Either way. There’s probably a way to do both using flocks or some such, but I haven’t put much time into that race condition.
At the end of the day, there are plenty of image conversion packages out there (ImageMagick comes to mind), but if you’re on MacOS/Darwin… why not use the builtins if they function? And `sips` does, in a clean and simple way. While it certainly isn’t a portable solution, it’s worth knowing about for anyone who does image work on a Mac and feels comfortable in the CLI.
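For comparison, the ImageMagick route is roughly as terse – this is stock `mogrify`, nothing Mac-specific:
# writes foo.png alongside each foo.tif, leaving the TIFFs in place
mogrify -format png ./*.tif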
Of lynx and curl
(I use `zsh`, and some aspects of this article may be `zsh`-specific, particularly the substitution trick. `bash` has similar ways to achieve these goals, but I won’t be going into anything bash-specific here.) At work, I was recently tasked with archiving several thousand records from a soon-to-be-mercifully-destroyed Lotus Notes database. Why they didn’t simply ask the DBA to do this is beyond me (just kidding, it almost certainly has to do with my time being less valuable, results be damned). No matter, however, as the puzzle was a welcome one, as was the opportunity to exercise my Unix (well, Cygwin in this case) chops a bit. The exercise became a simple one once I realized the database had a web server available to me, and that copies of the individual record web views would suffice. A simple pairing of `lynx` and `curl` easily got me what I needed, and I realized that I use these two in tandem quite often. Here’s the breakdown:
There are two basic steps to this process: use `lynx` to generate a list of links, and use `curl` to download them. There are other means of doing this, particularly when multiple depths need to be spidered. I like the control and safety afforded to me by this two-step process, however, so for situations where it works, it tends to be my go-to. To start, `lynx --dump 'http://brhfl.com'` will print out a clean, human-readable version of my homepage, with a list of all the links at the bottom, formatted like
1. http://brhfl.com/#content
2. http://brhfl.com/
3. http://brhfl.com/./about/
4. http://brhfl.com/./categories/
5. http://brhfl.com/./post/
…and so on (note to self: those ./ URLs function fine, and web browsers seem to transparently ignore them, but… maybe fix that?). For our purposes, we don’t want the formatted page, nor do we want the reference numbers. `awk` helps us here: `lynx --dump 'http://brhfl.com' | awk '/http/{print $2}'` looks for lines containing ‘http’, and only prints the second field of each matching line (the default field separator being whitespace).
http://brhfl.com/#content
http://brhfl.com/
http://brhfl.com/./about/
http://brhfl.com/./categories/
http://brhfl.com/./post/
…et cetera. For my purposes, I was able to single out only the links to records in my database by matching a second pattern. If we only wanted to return links to my ‘categories’ pages, we could do `lynx --dump 'http://brhfl.com' | awk '/http/&&/categories/{print $2}'`, using a boolean AND to match both patterns.
http://brhfl.com/./categories/
http://brhfl.com/./categories/apple/
http://brhfl.com/./categories/board-games/
http://brhfl.com/./categories/calculator/
http://brhfl.com/./categories/card-games/
…and so on. Belaboring this any further would be more a primer on `awk` than anything, but it is necessary for turning `lynx --dump` output into a viable list of URLs. While this seems like a clumsy first step, it’s part of the reason I like this two-step approach: my list of URLs is a very real thing that can be reviewed, modified, filtered, &c. before `curl` ever downloads a byte. All of the above examples print to stdout, so something more like `lynx --dump 'http://brhfl.com' | awk '/http/&&/categories/{print $2}' >> categories-urls` would (appending to and not clobbering) store my URLs in a file. Then it’s on to `curl`. `for i in $(< categories-urls); curl -O "$i"` worked just fine for my database capture, but our example here would be less than ideal because of the pretty URLs. `curl` will, in fact, return
curl: Remote file name has no length!
…and stop right there. This is because the `-O` option simplifies things by saving the local copy of the file with the remote file’s name. If we want to (or need to) name the files ourselves, we use the lowercase `-o filename` instead. While this would be a great place to learn more about `awk`, we can actually cheat a bit here and let the shell help us. `zsh` has a tail-matching substitution built in, used much like `basename` to get the tail end of a path. Since URLs are just paths, we can do the same thing here. To test this, we can `for i in $(< categories-urls); echo ${i:t}.html` and get
categories.html
apple.html
board-games.html
calculator.html
card-games.html
…blah, blah, blah. This seems to work, so all we need to do is plug it into our `curl` command: `for i in $(< categories-urls); (curl -o "${i:t}".html "$i"; sleep 2)`. I added the two seconds of sleep when I did my db crawl so that I wasn’t hammering the aging server. I doubt it would have made a difference so long as I wasn’t making all of these requests in parallel, but I had other things to work on while it did its thing anyway.
One more reason I like this approach to grabbing URLs – as we’re pulling things, we can very easily sort out the failed requests using `curl -f`, which returns a nonzero exit status upon failure. We can use this in tandem with the shell’s boolean OR to build a new list of URLs that have failed: `(i="http://brhfl.com/fail"; curl -fo "${i:t}".html "$i" || echo "$i" >> failed-category-urls)` gives us…
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
curl: (22) The requested URL returned error: 404 Not Found
~% < fail.html
zsh: no such file or directory: fail.html
zsh: exit 1 < fail.html
~% < failed-category-urls
http://brhfl.com/fail
…which we can then run through `curl` again, if we’d like, to get the resulting status codes of these URLs: `for i in $(< failed-category-urls); (printf "$i", >> failed-category-status-codes.csv; curl -o /dev/null --location --silent --head --write-out '%{http_code}\n' "$i" >> failed-category-status-codes.csv)`. `< failed-category-status-codes.csv` in this case gives us
http://brhfl.com/fail,404
…which we’re free to do what we want with. Which, in this case, is probably nothing. But it’s a good one-liner anyway.
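Putting it all together – nothing new here, just the pieces above in one place so future-me doesn’t have to reassemble them:
# build the list of URLs to grab
lynx --dump 'http://brhfl.com' | awk '/http/&&/categories/{print $2}' >> categories-urls
# fetch each one politely, logging any failures for later review
for i in $(< categories-urls); (curl -fo "${i:t}".html "$i" || echo "$i" >> failed-category-urls; sleep 2)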
Making multiple directories with mkdir -p
I often have to create a handful of directories under one root directory. `mkdir` can take multiple arguments, of course, so one can do `mkdir -p foo/bar foo/baz` or `mkdir foo !#:1/bar !#:1/baz` (the latter, of course, would make more sense given a root directory with a longer name than ‘foo’). But a little trick that I feel slips past a lot of people is to use `..` directory traversal to knock out a bunch of directories all in one pass. Since `-p` just makes whatever it needs to, and doesn’t care about whether or not any part of the directory you’re passing exists, `mkdir -p foo/bar/../baz` works to create `foo/bar` and `foo/baz`. This works for more complex structures as well, such as…
% mkdir -p top/mid-1/../mid-2/bottom-2/../../mid-3/bottom-3
% tree
.
└── top
    ├── mid-1
    ├── mid-2
    │   └── bottom-2
    └── mid-3
        └── bottom-3
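Worth noting that brace expansion will build the same tree without the `../` hops, since the shell expands the braces before `mkdir` ever sees them:
% mkdir -p top/{mid-1,mid-2/bottom-2,mid-3/bottom-3}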
Your Brand New Linux Install (A letter to my future self)
Dear Future Self,
If you’re reading this, hopefully it’s because you’re about to embark, once again, on the journey known as ‘installing Linux anew’. You’re predictable, you’re not particularly adventurous, so you’re almost certainly opting for Ubuntu out of some delusion that its consumer-friendly nature will make the install quick and seamless. But you only want the machine for writing/coding on, so you’re going to ruin your chances of a simple install by opting for Minimal. You’ve done it several times before, so it couldn’t be that bad, right? No, it won’t be, but I can tell you… I wish I had a me to guide me through it last time or the time before.
The actual install is simple enough. You may want to research the bundles of packages offered up during install time; I still haven’t figured those out. Also, it’s probably worth reviewing the encryption options, and the current state of the various filesystem choices, though you’re just going to choose whatever unstable thing has the most toys to play with anyway. Just let it do its thing, prepare a martini, relax.
dc as a code golf language
Code golf is a quirky little game – complete the challenge at hand in your preferred programming language in as few bytes as possible. Whole languages exist just for this purpose, with single-character commands and little regard for whitespace. `dc` always seemed like a plausible language for these exercises, and I recently attempted a few tasks which I would not ordinarily use `dc` for, notably 99 Bottles of Beer.
dc
Even though I generally have an HP or two handy, the POSIX command-line RPN calculator, `dc`, is probably the calculator that I use most often. The manpage is pretty clear on its operation, albeit in a very manpagish way. While the manpage makes for a nice reference, I've not seen a friendly, readable primer available on the program before. This is likely because there aren't really people lining up to use `dc`, but there are a couple of compelling reasons to get to know it. First, it's a required inclusion in a POSIX-compliant OS (and in fact, the only calculator that is, if memory serves). This is important if you're going to be stuck doing calculations on an unknown system. It's also important if you're already comfortable in a postfix environment, as the selection of such calculators can be limiting.
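For the completely uninitiated, the basic flavor of `dc` is just this: push operands (2 and 3), apply an operator (+), and print the top of the stack (p) –
% echo '2 3 + p' | dc
5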
dvtm and the mouse
I've gotten quite a few hits from people searching for things like 'dvtm pass mouse.' I don't have much to say on the matter, except that this is the one thing that really bugs me about dvtm. As I have mentioned previously, given the choice between screen, tmux, and dvtm, I like dvtm the best. It is certainly the simplest, and has the smallest footprint. It automatically configures spaces, and makes notions of simultasking as simple as double-clicking. I would say that it brings the best of the GUI experience to terminal multiplexing, while still keeping true to the command line.
Job Control
…(`setopt auto_continue`).
ep
I spend a good deal of time inside a terminal. Text-based apps are powerful, when you know what you're doing, and fast (also when you know what you're doing, I suppose). If an equivalent Cocoa or X11 GUI tool offers me little advantage, I'm probably going to stick to either a CLI- or TUI-based piece of software. One of the more important, taken-for-granted pieces of the command line environment is that of the pager. Typically, one would use something like more or less for their pager. For a while, I used w3m as my pager, as well as my text-based web browser. Then Snow Leopard came out, and everything from MacPorts got totally jacked up and left much of my system jacked up as well. Parts of it I've fixed, other parts I've been lazy about. For that reason, or perhaps not, I have transitioned to ELinks as my text-based web browser. Today, after recent discussions with a friend regarding w3m and ELinks, I had a thought - why not use ELinks as my pager as well?
dc Syntax for Vim
I use dc as my primary calculator for day-to-day needs. I use other calculators as well, but I try to largely stick to dc for two reasons - I was raised on postfix (HP 41CX, to be exact) and I'm pretty much guaranteed to find dc on any *nix machine I happen to come across. Recently, however, I've been expanding my horizons, experimenting with dc as a programming environment, something safe and comfortable to use as a mental exercise. All of that is another post for another day, however - right now I want to discuss writing a dc syntax definition for vim.