Hold The Robot Blog

Public Restroom Doors are a Nightmare

Tue, 24 Mar 2026 00:00:00 GMT

Look at this. This is the interior handle and lock of a single occupant public restroom door. Someone spent a lot of time and effort creating what might be the worst lock imaginable.

This is the interior of an airplane restroom. Whoever designed this door understood the assignment.

A restroom door needs to do three things:

Prevent anyone from entering when there's an occupant
Make the occupant feel like no-one will be able to enter (otherwise it's stressful)
Prevent people from having to attempt to enter to find out if there's an occupant

You'd think, considering how many of these damn doors have been built that this would be a thoroughly solved problem, but in my experience it's genuinely rare to find a door that correctly and consistently does all 3.

It was a failure of the first requirement that motivated this blog post.

A door like this is composed of several systems. It has:

Hinges that allow it to open and close
A latch that prevents it from opening
A knob that disengages the latch
An exterior lock that serves an authorization check (i.e. someone will size you up and decide if you deserve the code to the bathroom)
An interior lock that's operated only by the occupant

Any of these systems could be omitted (except the hinges and interior lock), and they often overlap.

Whoever put up this "look" sign knew there was a problem here, but didn't know what it was

Failures happen because human beings have to operate these systems without necessarily knowing which ones are present, how they work, or what the state of each system is.

Sometimes the systems are bad, but more often the problem is the signals. That handwritten "look" sign assumes people aren't looking at the locked/unlocked indicator before trying to enter. But they are looking; they're just assuming it's an indicator for the exterior lock, not the interior lock. There are three locks on this door after all. The "look" sign appeared a few weeks before the authorization lock got taped over in what I assume is just further desperation in trying to deal with this bad design.

It's not trivial to signal each of these systems correctly, but there's really only one that matters; the interior lock. And the state of this lock needs to be signalled clearly on both sides of the door.

In 2026, the year of our Lord, humanity has managed to solve half the problem (or at least that's where the U.S. is at. This might be better elsewhere). There is often a sign on the door exterior like this:

Explicitly "there is someone in here" and you can't miss it.

Here's the interior of the same door:

An employee had the sense to tape up a little sign to at least show the direction to turn the lock, but at what point is it locked? How will you know? Can you at least test that it's locked?

I nearly walked in on a little girl using this very restroom. The only thing that saved us both was a slight hesitation on my part after cracking this door (with a giant VACANT sign on the front, mind you) due to some dim memory that this lock catches early, just like the one in the first video. Nearly had to start my day at the coffee shop trying to explain to some furious parent why we don't need to call the police.

There are layers to how bad this design is.

Any slight misalignment in the door causes the lock to catch before it's actually locked.
There's no indication at all if the lock is properly locked. It's locked at "arbitrary degrees turned", and that's knowledge you don't have.
Turning the handle unlocks the lock, so YOU CAN'T EVEN CHECK THAT IT'S LOCKED.

At this point, the vacant/occupied sign is a liability, because it tells the person on the outside "hey there's definitely no one in here, walk right in". If your goal was to create the most evil restroom door possible, you could not have done better than this.

Cheap, low tech, and unfortunately forces the "have to try opening the door to find out if someone is in here". But it won't fail, and it's completely clear to the occupant when it's locked.

I don't know what it is about public restrooms. The doors, the stupid motion-sensing sinks and towel dispensers, and oh god the stalls. It's an essential part of life in public and I don't understand why we don't get it right.

In Software, One Thing is Better Than Two Things

Sun, 14 Dec 2025 00:00:00 GMT

I recently took over a software project that had, from both an engineering perspective and a usability perspective, outright failed. The code was a teetering tower that had collapsed in on itself into a pool of leaked abstractions and interdependent logic.

I've spent many hours wading through the mess, and it's become almost a meditation on what can go wrong in software. The issues with this specific code aren't that interesting; what feels important here are the core values a developer has to have, the ones that all the little microdecisions flow out of.

I think it sits at the bottom of "don't repeat yourself" or "keep it simple stupid" or all the other quippy platitudes. Even "single source of truth" is putting it too narrowly. Never have 2 things when you can have 1. I think this has to be so deeply rooted that it's more an ever present gut feeling than it is some explicitly reasoned rule. Like if you stand on a precarious ledge you don't need to be told to feel anxious.

Here's some shallow examples to try and illustrate the deeper point:

If a SQL table needs a date, it should have a date. Not a Unix timestamp and an ISO 8601 and a colloquially-styled date string and so on. Have one date column and make sure it's correct. Use a functional transform if you need to display something else.
When possible, avoid cache layers. A cache means you have data in two places and now have the non-trivial problem of keeping it in sync. If you can't avoid a cache layer, you at least want to make it feel like one thing. For example, tracking changes to the underlying data and pushing them into the cache layer is vastly better than relying on a timer to expire the cache. If you can avoid states where the cache decouples from the data underneath, you should.
You should prefer composition over inheritance. That's not advice; that's an observation. If you've ever spent time working with a deep inheritance tree, having to implement the same behaviour in multiple child classes or seeing logic sprinkled across a 5 layer deep inheritance chain should feel wrong. Composition is better because it more tightly defines the "one thing" of a behaviour, rather than letting that behaviour become diffused into different places or even outright duplicated.
If data needs to be validated, it should be validated once, at the point it enters the system. If your code is constantly having to call if is_valid(some_data) {...}, then there's no clear contract for what the application can trust. There's "fuzziness" in the system, and that's never good.
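To make the validation point a bit more concrete, here's a minimal bash sketch. It assumes a hypothetical script that takes a port number from the outside world; parse_port and describe_port are made-up names for illustration, not anything from the project I inherited. Raw input gets inspected in exactly one place, and everything downstream just trusts the result.

```bash
#!/usr/bin/env bash
# Hypothetical example: validate once at the boundary, trust the value afterwards.

# Boundary: the only place raw input is inspected.
# Prints the validated port on success, returns non-zero otherwise.
parse_port() {
  local raw=$1
  # Digits only, 1 to 5 characters, no leading zero (avoids octal surprises).
  [[ $raw =~ ^[1-9][0-9]{0,4}$ ]] || return 1
  (( raw <= 65535 )) || return 1
  printf '%s\n' "$raw"
}

# Interior: trusts its argument and never re-checks it.
describe_port() {
  printf 'listening on port %s\n' "$1"
}

# Usage: validate once, then pass the trusted value around.
if port=$(parse_port "8080"); then
  describe_port "$port"
fi
```

The interior function has no if-is-valid branch at all, which is the point: there is one contract, enforced in one place.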

For each of these examples, it's easy to think of "hey what about scenario X" or "but there's also consideration Y", and that's totally valid. There are reasons to duplicate data or behaviors, both pragmatic and conceptual. The point is that as you feel the fundamental tension between differing concerns in software design, the principle of "prefer one thing" should pull pretty hard.

The project failed because this principle was missing. Multiple overlapping cache layers made data in -> data out a broken relationship. Repeated code with small permutations meant something was always missed when things were added or changed. Layered, partial checks for error conditions meant errors that propagated deep into the code were only sometimes caught. It wasn't some singular critical flaw; it was small compounding errors that multiplied until the whole thing fell under the event horizon.

How to use Claude Code for big tasks without turning your code to shit

Mon, 10 Nov 2025 00:00:00 GMT

I find myself using LLMs for coding in 4 specific ways:

Finding information
Rubber ducking
Generating snippets of code or documentation
Having it work for hours on a big task with minimal intervention

I find the first 3 to be very useful, especially now that web search is dead. But they are similar in that the LLM is on a tight leash, and everything still ultimately flows through my brain. Number 4 is different though; it gives Claude a lot of leeway to go do whatever it wants, without my babysitting.

For a long time I've found the Claude-take-the-wheel style vibe coding to be an incredible waste of my time and sanity. For every "oh wow it worked" moment there were far too many "oh wow I just spent an hour sifting through garbage". Even when it accomplishes the task, the LLM always injects entropy into the code. Do it enough and this eventually ends in an incoherent fever dream of code slop. The software equivalent of "man who is sitting on a couch but somehow also is the couch". I found this video does a nice job demonstrating the anti-memetic nonsense you end up with.

Recently though, I started to worry that I had written it off too early. Some of my friends (whose opinions I trust) seemed to be getting better results, and I can't help but think back to the early days of Google search, where some people seemed to "get it" and others didn't. Clearly it can do impressive things; it's just a matter of

raising the odds of success, and
lowering the risk of wasted time on my part (I am pointedly ignoring the actual cost of token usage for now)

So I committed to a full week of heavy Claude Code usage, and set out to have it solve some major to-do items I had been putting off for months.

The specifics don't matter too much here, but for context, some of what I had it do:

Research all the available on-device speech-to-text models with permissive licences
Demo the transcription speed of each one on an android device attached to the PC
Write a C wrapper for the best one (Moonshine) and build an embeddable dynamic library
Build this for iOS, Android, Linux, and macOS, and integrate it with my app code using the FFI
Build a Nim wrapper for the fdk-aac library
Integrate it with miniaudio, so I can play AAC audio and pipe the audio into Moonshine

Plus many other tasks around wiring these things up and getting them running. Collectively I would estimate that these things would have taken me a month, and much of it would have been painful, tedious work.

Despite a rocky start (I nearly gave up on day 1), I ended up very happy with the results. I landed some solid new features, and my code is not shit (at least no more than it was). So here are some of my findings and bits of advice for how to drive this thing.

Every task needs a clear entry and exit point, e.g. "Run the program with ./run_program.sh, and look for 'module loaded successfully' in the log". Don't let it just crawl through the code and decide when it thinks it's done.
Put your time into the setup process and the review process, but not in between. Trying to steer Claude while it's working means you're investing time into an ephemeral state that you will as likely as not throw out later. If it goes wrong, just /clear, update the starting prompt, and go again. Once it goes wrong the context is usually too polluted anyway.
Always protect your own work with source control. That includes the work spent writing a prompt. It should always be trivial to wipe everything out, make some changes, and send Claude off again.
Keep the intersection of your code and Claude's code as minimal as possible. For example, if I want it to write a new miniaudio decoder at aac_decoder.c, that's the only file it's allowed to touch. It might generate lots of tests and docs, but those go into claude_tests and claude_docs, never into the actual test or docs directories. It might seem unintuitive, since you often do want testing and documentation for a new feature, but those things are first-order tasks that should be worked on directly. If you want tests, toss out all the garbage and have Claude write a couple simple tests that you can actually review.
Observe a real result before you even look at the code. If you're working on, say, an image processing feature, check the output image before reviewing anything. Seeing something actually work means you (probably) have correct code, even if it's encased in slop. But if there's no observable result, you're risking your time sifting through code that could be nonsense.
Constrain the context and look for references. For example, the prompt may take the form of a document like this: "Refer to file1.c, file2.c, and project_description.md. The miniaudio source is at ./external/miniaudio. The FDK library is at ./external/fdk-aac. We're going to be writing an integration similar to ./src/opus_decoder.c. The task is to..." The less "wander around and pull random stuff into the context" you can have it do, the better.
Set up minimal test projects. LLMs are pretty good at extracting something out of something else, so use a prompt like "Use as a reference and create a minimal project that demonstrates feature X". Then have it extend feature X without all the extra source clouding up its context.
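The "clear entry and exit point" advice is easy to turn into something mechanical. Here's a rough bash sketch of an exit check: it runs the entry-point command, captures the log, and passes only if the success marker shows up. The check_task function and the run.log file name are my own inventions for illustration; the command and marker are the ones from the example above.

```bash
#!/usr/bin/env bash
# Sketch of an exit check for an agent task (check_task is a hypothetical helper).

# check_task CMD MARKER LOGFILE
# Runs CMD, writes stdout and stderr to LOGFILE, and succeeds only if
# MARKER appears somewhere in the log.
check_task() {
  local cmd=$1 marker=$2 log=$3
  # The command's own exit status is ignored on purpose: the marker in the
  # log is the agreed-upon definition of "done".
  bash -c "$cmd" > "$log" 2>&1 || true
  grep -qF -- "$marker" "$log"
}

# Usage, with the names from the example above:
#   check_task ./run_program.sh 'module loaded successfully' run.log \
#     && echo PASS || echo FAIL
```

Putting the check in a script means the prompt can just say "you are done when this passes", and I can run the same check myself before looking at any code.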

Metaphorically, I think about any job given to Claude as having 3 dimensions. There's the breadth of the task (roughly how many lines of code it will touch), the depth of the task (the complexity, the layers of abstraction needed, the decision making involved, etc.), and the time spent working on it. Those three axes define a cube, and the size of the cube is how much entropy I'm shoving into the project. Something like "Update all the imports to use the new source structure" is broad (will touch almost all the files) and potentially long-running, but it's conceptually very simple, so the volume is low. "Simplify the codebase and create clean lines of abstraction" is conceptually deep, broad, and will take a long time. Huge entropy cube. So the idea is to shrink the cube when possible, and to only deal with the section of the cube you need (like aac_decoder.c but not ./aac_decoder_tests and ./aac_decoder_project_milestones).

Ultimately, I walked away from my week of heavy Claude usage without any kind of polarized opinion. It's not a terrifying new intelligence machine on the cusp of AGI, but it's also not useless grift. In certain contexts it's a powerful tool that speeds up software development. It also has the potential to be a huge time sink and can absolutely ruin code. But after putting some time into it, my intuition has gotten much better about when to use it, how to use it, and when to leave it alone. My feeling is that an inexperienced developer is at risk of over using it, but an experienced developer may be at risk of under using it. I was the latter, and I'm very happy to have changed that.

Xcode is the Worst Piece of Professional Software I Have Ever Used

Mon, 22 Sep 2025 00:00:00 GMT

The compiler is unable to type-check this expression in reasonable time; try breaking up the expression into distinct sub-expressions

This is an error you will see often if you develop SwiftUI in Xcode. Know what it means? It means the compiler has given up, and you're on your own. The error points to a file and function, but the issue could be anywhere in your codebase. It might be a simple syntax error, or it might be code that is "too complex" for the compiler. Hopefully you commit frequently because Xcode has turned into Notepad until you figure it out.

Speaking of git; consider the project file (myProject.xcodeproj/project.pbxproj). This file contains all of your project settings, build configs, file references, signing configs, and anything else you can think of. If there are any errors in it, your project simply won't open. It is thousands of lines long and not human readable. Here's a small sample:

7A226CEB2D722B3C001539F8 /* PBXContainerItemProxy */ = {
	isa = PBXContainerItemProxy;
	containerPortal = 7A226C922D722973001539F8 /* Pods.xcodeproj */;
	proxyType = 2;
	remoteGlobalIDString = E826FA0DCB9AA6E7829C68391B323B78;
	remoteInfo = "GTMSessionFetcher-GTMSessionFetcher_Core_Privacy";
};

I won't explain what that means, because I don't know what it means. A merge conflict is exactly as miserable as it sounds. The semi-sane way to deal with it is to use something like xcodegen to rewrite the project settings in a normal-ass file type like yaml and then use that to generate the Xcode project files. BTW the entire UI layer for UIKit apps is stored in unreadable files like this. Imagine.

Take a look at this dialog box. Notice that weirdly dark drop shadow behind it? That's not a UI glitch; that's dialog boxes stacked on top of each other, each waiting for your admin password. You'll know you're close to done when the drop shadow starts to lighten.

Software has bugs and design flaws. I'm not trying to say Xcode sucks because it's buggy (although I'd like to emphasise that it is very buggy). It sucks because it pretends it isn't. Look back at that first error: "unable to type-check this expression in reasonable time; try breaking up the expression". It's not a bug, it didn't crash, it just... you know. Taking a while. Try wasting your time refactoring your code without knowing where the problem is or having the help of a compiler.

Suppose you're testing out in-app purchases (God help you). You follow Apple's docs, create a sandbox account for testing, and then open the simulator. Apple says the sandbox account will appear in the phone settings after running the app, but of course it doesn't. You attempt to manually sign in and see this:

Okay probably a mistyped password. Try again. Try 10 more times. Maybe you shouldn't have skipped the 2FA setup for this test account? You set up 2FA. Still nothing. You open the Xcode debugger and find Password reuse not available for account. The account state does not support password reuse. WTF is this? You're not reusing a password. You start to wonder if maybe this doesn't work in the simulator, even though Apple's docs make no mention of this. You search around on the developer forums. People confirm that this definitely does not work in the simulator. Other people confirm that it definitely does. There's no answers to be found, and no solid info anywhere.

As a developer, you learn that you simply cannot trust Apple. There is a persistent layer of vagueness and misdirection around every part of the experience. All those WWDC videos showing off new features and frameworks? You know, the ones where the presenter is seemingly going to be shot if they don't hit the adjective quota? Those are basically ads. Watching a presentation on the SwiftUI preview feature (a way to see your UI update without a full app rebuild. Not to be confused with the actually useful hot reloading in Flutter or a web based framework) and then trying to actually use it was pure comedy. There's been years of steady improvement and last I checked it was still mostly useless for any sufficiently complex project. So imagine how bad it was at launch. Not a hint of this at WWDC though. Just a seemingly complete, cutting edge new feature that Apple is so excited to tell you about.

Here's the kicker. Something I cannot and will not get over. Apple's bug tracker is private. You can submit a bug (which has been euphemized to "radar"), but the bug reporter is a black hole; information goes in but it doesn't come out. Starting to wonder if some weird behavior with navigation is maybe not actually a problem on your end? Expecting to search through some Github issues-like tracker to see if anyone is experiencing something similar? Sorry. Better that thousands of developers waste their time rediscovering some subtle framework bug than for Apple to publicly acknowledge a flaw.

It isn't just opaque issues and errors. The design of Xcode and everything around it is stifling. There are currently no real alternative editors if you're working on an iOS project and want things like linting and code completion (it is actually possible with neovim using xcode-build-server, but it's pretty flaky). Jetbrains' AppCode was killed off a few years ago. CLI tools are poorly documented and difficult to use, which means simply trying to script basic things or do CI is painful. Fastlane helps, but it's ridiculous that there needs to be a big Google-funded open source project to just make scripting tolerable. This means Claude Code will struggle to do anything useful too BTW (even more than it already does). And of course it goes without saying that you must be doing all of this on a mac in the first place.

I actually learned software development in Xcode, back before automatic reference counting was even a twinkle in Objective-C's eye. I honestly believe that it hurt my growth as a developer and gave me a poor set of instincts. Rather than reacting to a problem by seeking to go deeper and understand what is happening underneath the code I'm writing, the solutions were mindless and ritualistic. "Try restarting Xcode. Try clearing the derived data. Try rebooting your mac. Try the Xcode beta branch. Try recreating the project." Of course I could have been more mindful and deliberate, but it's hard to know what bad thought patterns you're picking up when the environment is working against you.

I wish the developer experience was better, but Apple does not appear to want to address their technical debt, and developers were always second class citizens anyway. I would encourage any new developers to try and stay away from Xcode (to the extent that you can), and if you are using it and questioning your own sanity thinking "am I just holding it wrong?": no, you're not. Xcode sucks.

Heredocs Can Make Your Bash Scripts Self-Documenting

Fri, 25 Jul 2025 00:00:00 GMT

I have long since come to appreciate the value of writing scripts to save someone else (or future me) from having to re-learn and re-solve problems, but something about it has always bugged me.

I am automating a process, but I'm also documenting it, and those two things struggle to coexist.

One option is to write a bash script for the automation and a markdown file for the documentation, but they inevitably end up duplicating information and/or getting out of sync. The other is to just have a single markdown file with a bunch of inline bash that you manually copy into a terminal. But "running" it is clunky, tedious, and easy to mess up.

I tend to prefer the latter despite the annoyances, because "keeping information in sync" is such a big problem. But recently I've been playing with a third option. Rather than maintaining two files or putting bash in markdown; put markdown in bash.

It looks like this (I'm showing you an image since Docusaurus won't syntax highlight this):

This is just a bash script that can be executed like normal.

The markdown bit is a "heredoc", which is basically just a multiline string, similar to a triple-quoted string in python. The <<'delimiter' starts the string and delimiter ends it. Be careful to quote the first delimiter, otherwise you'll get parameter expansion (things like $HOME will expand to /home/myusername) or even execution in your doc strings (intuitive as always, thanks Bash). I chose -md- as a delimiter, but you can choose whatever you like, as long as it's not a string you're going to be using otherwise.

If you precede the heredoc with cat it will print to the terminal when you run the script, but you can also leave that out.
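In plain text, a minimal sketch of the pattern looks something like this. The -md- delimiter is the one described above; the scratch directory and the steps themselves are made up for illustration.

```bash
#!/usr/bin/env bash

cat <<'-md-'
# Set up the scratch directory

This block is printed when the script runs. Nothing in it is expanded:
$HOME stays literal because the opening delimiter is quoted.
-md-

# Hypothetical step: any ordinary bash can sit between the doc blocks.
scratch_dir="${TMPDIR:-/tmp}/heredoc_demo"
mkdir -p "$scratch_dir"

<<'-md-'
## Notes

This block has no cat in front of it, so it is documentation only
and nothing is printed.
-md-

echo "scratch dir ready: $scratch_dir"
```

The bare form works because bash accepts a line that consists of nothing but a redirection; the heredoc is read and then thrown away.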

I use the vim plugin preservim/vim-markdown to get markdown syntax highlighting, concealment, links, and so on. By default, none of that is going to work inside a bash script, but you can fix that by adding the following to .config/nvim/after/syntax/sh.vim (create the file and path if needed):

syntax region shMarkdown 
    \ matchgroup=shMarkdownDelimiter 
    \ start=/<<'-md-'\s*$/ 
    \ end=/^-md-\s*$/ 
    \ contains=@markdownHighlight 
    \ containedin=shHereDoc,shHereString
    \ keepend

syntax include @markdownHighlight syntax/markdown.vim

" Link the delimiter to Comment so it's greyed out
highlight link shMarkdownDelimiter Comment

And there you go; markdown-ified bash scripting.

There's still plenty of times a markdown file makes more sense, since you're not always writing bash commands that are intended to be run top-to-bottom. I have a file that lists various ffmpeg commands, for example, and I'm only ever going to be copy-pasting things out of that file. But for a runbook style script I really quite like this and I think it's absolutely a better option than maintaining separate scripts and documentation. There's a reason why so many modern codebases use inline documentation, and I think bash scripts should do the same.

A CRDT-based Messenger in 12 Lines of Bash Using a Synced Folder

Wed, 25 Jun 2025 00:00:00 GMT

mkdir -p "$(dirname "$0")/data"; cd "$(dirname "$0")/data"

print_messages() {
    clear
    cat $(ls -tr | tail -n30)
    printf "\033[31m$USER:\033[0m "
}
export -f print_messages

watchexec 2> /dev/null -- bash -c "print_messages" &

while read text; do
  printf "\033[31m$USER:\033[0m $text\n\n" > "$(uuidgen)"
done

Put that script inside a folder, share the folder with someone via Syncthing or Dropbox or whatever, run it, and you should get this:

This is hardly a Discord killer, but as far as messengers go there are some interesting properties:

There is no central authority or server that "owns" the messages
An offline machine can write new messages that will propagate once it's back online
All participating machines will show the same messages in the same order once they're synced, no matter what

There's nothing really novel about those three things; that's what you get out of the box with Conflict Free Replicated Data Types (CRDTs). So my goal with this blog post is to plant the seed in your mind that CRDTs are just generally cool, and they are very simple.

And even though this little messenger is kinda toy-ish, it's completely solid and I use it to communicate with an (equally nerdy) friend of mine. I've used the same technique to create a time tracker that I can use on different machines without ever worrying about being online or things getting out of sync. We're obviously relying on a file sync program to do some heavy lifting here, but because the data is "conflict free", something as simple as rsync or scp would work (and always work) just fine.

The Bash Script

There's not much to it, so I'll run through it quick.

mkdir -p "$(dirname "$0")/data"; cd "$(dirname "$0")/data"

Create a data directory (if needed) and move into it.

print_messages() {
    clear
    cat $(ls -tr | tail -n30)
    printf "\033[31m$USER:\033[0m "
}

Clear the screen, print the contents of the last 30 messages, and then print the mike: prompt. The gross looking \033[31m stuff is just ANSI escape codes to set and unset the color.

export -f print_messages

Some Bash nonsense to "export a function". Otherwise the watchexec subprocess can't see it.

watchexec 2> /dev/null -- bash -c "print_messages" &

Start up watchexec. Send its stderr output into /dev/null so it doesn't bother us. Whenever it sees file changes, reprint the messages. The & symbol makes it run in the background so our script can do other things.

BTW, I used watchexec to watch for file changes because it works on Termux, which lets me use this on Android. If you want to use fswatch (which seems more common) instead, replace that line with this:

print_messages # fswatch doesn't fire on startup, so print messages first
fswatch -o . | while read -r event; do
  print_messages
done &

while read text; do
  printf "\033[31m$USER:\033[0m $text\n\n" > "$(uuidgen)"
done

Read user input. When they hit Enter, put whatever they wrote into a file. Critically, use a Universally Unique Identifier for that file.

So basically, stuff messages into files that all have UUIDs. If you look inside data you'll see this:

$ ls data
035aa216-3e23-4921-8d14-b79bdc150232  5d07ed32-8f0c-4c88-9a93-f12606d57ea1
04455d8b-b58a-40da-a01a-7631e90ccbd8  6187df26-8a6e-4729-9553-ffe1acf0d45f
1d621700-322e-4ba9-9d66-16a739838adf  650b8c73-9ef5-40e5-8014-a8d97d617f1f

Why This Works

Using UUIDs means that any machine can create files without having to worry about another machine creating an identically-named file and causing a conflict, which would break our whole system.

We can't delete files, because if a machine deletes a file and then tries to sync, it won't be able to tell if it deleted something or if it's just talking to a machine that has a new file (we actually could get away with this because Syncthing keeps a local database to log file deletions, but that's cheating, and simpler tools like scp definitely don't. Plus there's better ways anyway).

Lastly, after two machines exchange files, it's critical that they both can display messages in the same order. Using ls -tr to order the files actually works perfectly, because -tr (order by time, reversed) uses the file modification date, and that gets preserved when copying the files. It's technically possible to create files with the same modification date on two different machines and therefore have an arbitrary ordering, but at least on Linux with most modern filesystems you get billionth-of-a-second granularity, which is more than fine. On a filesystem like FAT32 with 2 second granularity this would very much be a problem.

So, those 3 properties mean that we have created a CRDT. CRDTs are just data structures that:

Can be replicated across an arbitrary number of nodes
Can be modified concurrently
Will always converge to the same thing after nodes sync with each other

Specifically, we've created a grow-only set. If we ignored the contents of each file we could still count them with something like ls -1 | wc -l, and that would be an even simpler CRDT called a grow-only counter.

That's what I used in the time tracker thing I mentioned earlier. Just add a file with a UUID into a directory called 25_minute_pomodoros, and now you have a distributed, conflict-free pomodoro counter.

Edits and Deletions

So an obvious problem is that you can't edit or delete a message. And, in fact, it's fundamental to the design that once you create a new file, you absolutely do not mess with it.

To get around that, you just create more files. So in the Pomodoro example, there's a folder called 25_minute_pomodoros_deletions. If I decide that I want to decrement my Pomodoro counter, just touch 25_minute_pomodoros_deletions/$(uuidgen). Then subtract the number of files in 25_minute_pomodoros_deletions from 25_minute_pomodoros. This is called a positive-negative counter.
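Here's roughly what that counter could look like as a few bash functions. The directory names are the ones from my time tracker; the function names and the new_id helper (with its Linux-only fallback for machines without uuidgen) are assumptions made up for this sketch.

```bash
#!/usr/bin/env bash
# Sketch of a positive-negative counter built from two grow-only directories.

add_dir=25_minute_pomodoros
del_dir=25_minute_pomodoros_deletions

# Hypothetical helper: prefer uuidgen, fall back to the Linux kernel's
# UUID source if uuidgen is not installed.
new_id() {
  uuidgen 2>/dev/null || cat /proc/sys/kernel/random/uuid
}

# Number of files in a directory; 0 if the directory does not exist yet.
count_files() {
  local dir=$1
  [ -d "$dir" ] || { echo 0; return; }
  find "$dir" -type f | wc -l | tr -d ' '
}

# Both operations only ever add files, so syncing can never conflict.
pomodoro_increment() { mkdir -p "$add_dir" && touch "$add_dir/$(new_id)"; }
pomodoro_decrement() { mkdir -p "$del_dir" && touch "$del_dir/$(new_id)"; }

# The visible value is derived: additions minus deletions.
pomodoro_value() {
  echo $(( $(count_files "$add_dir") - $(count_files "$del_dir") ))
}
```

Note that nothing is ever removed from either directory. Decrementing is just another kind of growing.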

For messages, rather than just putting the plain text contents in each file, we could do more structured data like:

message:Hey it's mike what's up?

delete:2880dbc8-a2c6-43c0-8f88-e0fb2672755c

edit:2880dbc8-a2c6-43c0-8f88-e0fb2672755c:Hey, it's Mike what's up???

We'd then have to actually inspect the contents of each message and decide if it should be displayed or if it affects a previous message (so we're well beyond 12 lines of bash at this point) but it doesn't change anything about the properties of the system. Any machine can make those changes freely, and messages will always be rendered the same way.
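As a rough sketch of what that inspection could look like, here's a render function for the message/delete/edit format above. It assumes each file holds a single line, that the file name is the UUID other files refer to, and that you have bash 4 or newer (for associative arrays). render_messages is a made-up name, and this is one way to do it rather than what my actual messenger does.

```bash
#!/usr/bin/env bash
# Sketch: replay every file in modification-time order and print the result.
# Requires bash 4+ for associative arrays.

render_messages() {
  local dir=$1 id line target
  local -A text=() deleted=()
  local -a order=()

  # Oldest first, same ordering as the original script (ls -tr).
  while IFS= read -r id; do
    # Read the first line; tolerate a missing trailing newline, skip empty files.
    IFS= read -r line < "$dir/$id" || [ -n "$line" ] || continue
    case $line in
      message:*)
        text[$id]=${line#message:}
        order+=("$id")
        ;;
      delete:*)
        target=${line#delete:}
        [ -n "$target" ] && deleted[$target]=1
        ;;
      edit:*)
        line=${line#edit:}        # now "<uuid>:<new text>"
        target=${line%%:*}        # everything before the first colon
        [ -n "$target" ] && text[$target]=${line#*:}
        ;;
    esac
  done < <(ls -tr "$dir")

  # Print surviving messages in their original order.
  for id in "${order[@]}"; do
    [ -n "${deleted[$id]:-}" ] || printf '%s\n' "${text[$id]}"
  done
}
```

Every machine that holds the same set of files replays them in the same order and lands on the same output, which is the whole trick. An edit that refers to an ID with no message file is simply never displayed.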

The Takeaway

The important concept here is that data is stored in one of these very simple CRDT models, and you can use that basic model to deterministically "render" whatever data you want.

Flat files and uuidgen are enough to implement the data structure (not saying you should, but clearly you can). The sync part is what's mind blowing. You can sync arbitrarily complex data between an arbitrary number of devices without knowing anything about it. rsync or scp could easily handle this job.

If we were doing the same thing in a more sane way (like, say, storing these messages in a local sqlite database), you could still pump messages between machines without any care for what's in them or what they mean.

Even if you want a dedicated server, the server does not need to know how to render them, so the entirety of the server logic can be: Hey, let's compare messages. Please give me the ones that I'm missing. Here are the ones that you're missing. And you have one endpoint: /sync.
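To show how little that exchange needs, here's a minimal sketch that uses two local directories as stand-ins for the client and the server. There's no real /sync endpoint here, and send_missing and sync_replicas are hypothetical names. The only rule is that each side copies over the files the other lacks and never overwrites anything.

```shell
#!/bin/sh
# Copy every file that exists in $1 but not in $2. Existing files are never
# touched, which preserves the "never modify a file once created" rule.
send_missing() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        [ -e "$2/$(basename "$f")" ] || cp "$f" "$2/"
    done
}

# A full sync is the same exchange in both directions.
sync_replicas() {
    send_missing "$1" "$2"
    send_missing "$2" "$1"
}

# Demo: each replica starts with a message the other has never seen.
a=$(mktemp -d)
b=$(mktemp -d)
echo "message:hi from the laptop" > "$a/id-1"
echo "message:hi from the phone" > "$b/id-2"
sync_replicas "$a" "$b"
ls "$a"   # lists id-1 and id-2; so does ls "$b"
```

Note that nothing in there ever opens a file to look at its contents, and running it twice changes nothing.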

I've been building things with CRDTs for a while now and have developed a real love for what they let you do. I'd love to talk more about them soon, but for now, I hope that's at least a fun introduction for anyone who isn't familiar yet. I really think they're being slept on and I hope more people start using them.

A Little Note: If you actually want to play with this and you're using Syncthing, messages are kinda slow by default. There's a setting in ~/.config/syncthing/config.xml called fsWatcherDelayS. Set it to "1" for the folder you're keeping messages in and it will be much faster. If you're using Google Drive or Dropbox or whatever, you're on your own.

Coding Without a Laptop - Two Weeks with AR Glasses and Linux on Android

Sun, 11 May 2025 00:00:00 GMT

I recently learned something that blew my mind: you can run a full desktop Linux environment on your phone.

Not some clunky virtual machine and not an outright OS replacement like Ubuntu Touch or postmarketOS. Just native arm64 binaries running inside a little chroot container on Android. Check it out:

i3, picom, polybar, firefox, and htop

That's a graphical environment via X11 with real window management and compositing, Firefox comfortably playing YouTube (including working audio), and a status bar with system stats. It launches in less than a second and feels snappy.

Ignoring the details of getting this to work for the moment, the obvious response is "okay yeah that's neat but like, why". And fair enough. It's novel, but surely not useful.

Thing is, I had a 2 week trip coming up where I'd need to work, and I got a little obsessed with the idea that I could somehow leave my laptop at home and just use my phone. So what if we add a folding keyboard and some AR glasses?

Here's a CRDT-based ebook/audiobook reader I've been working on, running a desktop Linux app and connected to the Flutter debugger.

What's kind of amazing here is that both the glasses and the keyboard fit comfortably in my pockets. And I'm already carrying the phone, so it's not that much extra.

The Hardware

Keyboard: There's plenty of little folding bluetooth keyboards on the market, and I only had to go through 5 of them before I found one that was tolerable. I tried some with trackpads, but they were either too big or the keys were squeezed together to make it fit. The Termux:X11 app that displays the graphical environment is able to function as a trackpad to move a mouse pointer around, and that turned out to be good enough for mouse input. I'm very keyboard-centric anyway, so I'd often go for a while without needing to touch it.

The Glasses: Believe it or not, "augmented reality" glasses are kinda good now. The AR part is almost entirely a misnomer; they're just tiny little OLED displays strapped to your face attached to bird bath optics. I was able to get a lightly used pair of Xreal Air 2 Pros off of ebay that would show me a 1080p display with a 46° field of view. Some of the newer ones can do large virtual displays rather than the pinned-to-your-head image that mine have, but I'm pretty skeptical of that setup, at least until the resolution and field of view improve.

The Phone: I unfortunately had to upgrade my phone, because to drive the glasses you need to have DisplayPort Alt mode. My very cheap, very crappy old phone did not. The 8 series seems to be the first Pixel phone where Google decided to be marginally less evil and not lock out the DP Alt Mode feature in software (forcing people to buy Chromecasts? IDK), so I bought a used Pixel 8 Pro on ebay.

So the whole setup:

Used Pixel 8 Pro - $350
Used Xreal Air 2 Pro - $260
Samers Foldable Keyboard - $18

Total cost: $628. Although I'm not sure the $350 for the phone should count, because I really did need a new one.

After a few afternoons experimenting, I felt like I could probably function with only this setup for the two weeks. I figured the full commit would keep me from reverting back to a PC when I hit a wall and got frustrated or bored.

The Result

So after using this on an airplane, in coffee shops, at various family members' houses, in parks, and even sitting in the car, I think I have some answers for "why would you use this when laptops exist and are excellent".

It really does fit into your pockets. No bag, nothing to carry.
I can use it outdoors in bright sunlight. I wrote most of this blog post sitting at a picnic table in a park. Screen glare and brightness are not an issue.
I can fit into tight spaces. This setup was infinitely more comfortable than a laptop when on a plane. Some coffee shops also have narrow bars that are too small for a laptop, but not for this.
The phone has a cellular connection, so I'm not tied to wifi.

In other words, there's a sense of freedom that you do not get with a laptop. And I can be outdoors. One of the things I've grown tired of as a software dev is feeling like I'm stuck inside all the time in front of a screen. With this I can walk to a coffee shop and work for an hour or two, then get up and walk to a park for another hour of work. It feels like a breath of fresh air, quite literally.

That said, there were plenty of pain points and nuances to the whole thing. So here's my experience:

The Linux Environment

Linux-on-Android was eventually great, but I don't want to gloss over the fact that it was a pain in the ass to figure out. My definition of "sufficiently capable" was Neovim + functioning language servers (Nim, Python, Dart, JS), Node, and Flutter (compiling to both desktop and web apps that could be run and debugged).

I won't go through everything line-by-line here (I can though, if anyone is interested), but there are already some great resources out there (linked below). Here's the high level picture, based on my learnings.

There's roughly 4 different approaches to Linux on Android:

A virtual machine emulating x86_64
Termux, which is an Android app that provides a mix of terminal emulator, lightweight Linux userland, and set of packages that are able to run in that environment.
arm64 binaries running in chroot, which is basically just a directory where those programs will run, sealed off from the rest of the filesystem. Notably, it requires the system to be rooted.
proot. Same idea as chroot, but doesn't use the forbidden system calls that chroot needs root for

After way too much time spent experimenting, I landed on the chroot approach. I really didn't want to root the phone, but nothing else did what I needed. The virtual machine was way too slow and clunky, as was proot. Sticking to what can be run inside Termux got me surprisingly far, but Android's C library is Bionic (not glibc) and most programs won't run unless they're compiled with that in mind. That, plus other differences in the environment, means you're pretty limited. Chroot has no performance penalty as far as I can tell, and (for the most part) anything that can be compiled for arm64 seemed to work.

As far as distro (I tried many), here's what matters:

Small and light. This is a phone, after all.
Has to support aarch64, obviously.
Doesn't use systemd (I could never make it work inside chroot, and it's unclear if it's possible).
Has some amount of testing or support for running in chroot. Arch Linux ARM, for example, had some odd issues here, like fakeroot not working.
Uses glibc. I thought Alpine was going to be the ticket, but I really needed Flutter/Dart to work, and I couldn't get it working with musl. This might not be a problem for everyone though.

So ultimately, the aarch64 glibc rootfs tarball of Void Linux fit the bill, and it's been running beautifully.

I used i3 (a keyboard-centric tiling window manager), but I tested xfce and that worked fine too.

Some useful links:

The AR Glasses

The quality of the image on these things is fantastic. You're seeing bright pixels from a beautiful OLED display. But because each pixel is bounced off the lens, a black pixel just looks clear. So a black terminal background with white text means you're seeing white text floating in space. This is actually pretty cool if you want a "less screen, more world around you" kind of feel, but can also be distracting. However, the model I bought has electrochromic dimming, so you can darken the actual "sunglasses" part to block out ambient light. Without this they'd be unusable in bright sunlight as the image washes out, so I highly recommend getting a pair that has this.

It's apparently impossible to get a good through-the-lens photo, but trust me that the image through the glasses is excellent. This is without the electrochromic dimming turned on, so text just floats in front of the scenery. You can darken the glasses to the point where you can hardly see through them if you want.

I do feel a little weird wearing these in public, but not that weird. They more or less pass for sunglasses, so the odd part is wearing sunglasses indoors and typing on a keyboard with nothing in front of you. I had a couple of people ask me about them, but they seemed to just think they were cool. One guy said he was going to buy a pair. That may be selection bias though; I'm sure some people thought I was an idiot.

The biggest downside of the glasses is that the FOV is actually too big. Seeing the top and bottom edges of the screen means moving your eyeballs to angles that are just a little uncomfortable, and it's actually difficult to get the lenses in the right spot so that both are clearly in focus at the same time. I had the window manager add some extra padding at the top and bottom of the screen, and that helped quite a bit.

Worth mentioning: I tried to get multi-display mode working on Android, and it was awful. I ended up using this app to change the phone's resolution to 1080p, and then just mirror to the glasses. It turned out to be great, because you can pull the glasses off and just work on the phone whenever you want a break.

The focal plane of the glasses is about 10 feet. Which means if you use readers for a laptop, you probably won't need them.

The Keyboard

Sigh. Can someone please make a good folding keyboard? This little $18 piece of plastic is decent for what it is, but this was the weakest part of the whole setup, and it feels like it should be the easiest. It feels cheap, is bulkier than it needs to be, doesn't lock when it's open (which means you can't really sit with it in your lap), and there's no firmware based key remapping.

I might continue to play alibaba roulette and see if there's a better one out there. But I would quite literally pay 10 times as much for something good.

Performance

As a rough benchmark, I tried compiling Nim from source.

On my Framework 13 with a Core Ultra 5 125H it took 4:15.
On my Thinkpad T450s with an Intel Core i5-5300U it took 14:20.
On the Pixel 8 Pro it took 11:20.

I would say qualitatively that's about how it feels to use. Faster than the Thinkpad, but definitely not as fast as the Framework.
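To put rough numbers on that, here's the arithmetic on the three compile times above. The to_seconds and ratio helpers are made up for this example, and a single compile is obviously not a rigorous benchmark.

```shell
#!/bin/sh
# Convert a "minutes:seconds" string to seconds.
to_seconds() {
    m=${1%%:*}
    s=${1##*:}
    echo $(( $m * 60 + $s ))
}

# Print the ratio a / b to two decimal places.
ratio() { awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'; }

framework=$(to_seconds 4:15)    # 255 seconds
thinkpad=$(to_seconds 14:20)    # 860 seconds
pixel=$(to_seconds 11:20)       # 680 seconds

ratio "$pixel" "$framework"     # prints 2.67 (Pixel takes ~2.7x as long as the Framework)
ratio "$thinkpad" "$pixel"      # prints 1.26 (Thinkpad takes ~1.3x as long as the Pixel)
```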

BTW I am glad I paid a little extra for the Pixel 8 Pro, because the 12GB of RAM it has vs the 8 of the non-pro model seems worthwhile. RAM usage often gets close to that 12GB ceiling.

Battery Life

With the glasses on and the phone screen dimmed, the phone used a little under 3 watts at idle, and 5 to 10 when compiling or doing heavier things. On average I'd drain about 15% battery per hour. So 4 to 5 hours before you need to be thinking about charging, but I'm not sure you'd want to have the glasses on longer than that anyway.

Am I Going to Keep Using This?

I'm safely out of the novelty phase at this point, and incredibly, I think the answer is yes. If I had my laptop with me I would never reach for the phone, in the same way that if I'm sitting next to my desktop PC, I'm not going to grab my laptop. But this phone setup can go places that the laptop can't, and that freedom is something I've been wanting for a long time, even if I didn't quite realize it.

I also find it amazing that the whole thing was relatively cheap, especially when compared to something like the Apple Vision Pro. Which, funnily enough, can't do any of what I ended up caring about. It can't fit in your pockets, and it's no more capable of "real" computing than an iPhone. I guess you can use it outdoors, but your eyes are in a sealed box, so I don't think that even counts.

I think there might actually be a future for ultra-mobile software development. Especially as these AR glasses continue to improve and Linux continues to be flexible and awesome. Despite the rough edges, I'm able to go places and do things now that I couldn't do before, and I'm excited about it.
