fear and networking

This past Friday, as I was starting work for the day, I noticed that there were updates available for the various Apple OSen. I decided to go ahead and update my desktop Mac Mini and my work MBP first thing, while I finished planning out my day and reading the overnight email on my personal MBP. The updates applied without incident (seemingly), and my next task was to use my desktop to access several of the computers I use every day, to complete a weekly maintenance ritual: applying other software updates, making sure Git repos were clean and up-to-date, and so on — my periodic rote task of pushing back on entropy on these particular computers.

…and that’s when things went off the rails.

I carry out this multi-computer upgrade process by SSHing from one machine out to the other four (the five machines being the aforementioned desktop, personal laptop, and work laptop, plus a local Linux server and a remote Linux box) and then running various commands on each, in a tmux session per host.

The record scratch moment on Friday was that I was suddenly unable to SSH from my desktop to either my personal laptop or the work-issued one — but I could still SSH to both the local and the remote Linux boxes, so it clearly wasn’t an issue with SSH, or connectivity. Worse, the error message that I got was pretty much completely information-free:

ssh: connect to host bumfit port 22: Undefined error: 0

(Yes, my personal laptop is named bumfit. It’s a sheep-counting word that means “15 sheep”.)

I swore a little bit, then grabbed the work laptop, and was surprised to discover that I didn’t have any problem SSHing from there to the desktop machine! SSHing from my personal laptop (which hadn’t had the update applied yet) also worked. Further, I could SSH from one laptop to the other, and vice versa.

Things got progressively weirder, as I figured out that, while I couldn’t SSH from the desktop to either laptop, if I started on either laptop, not only could I SSH into the desktop, from that session I could SSH back out to the laptop without problem!

I spent a couple hours disabling and enabling firewalls, staring at routing tables, staring at ARP tables, rebooting the desktop (a couple times), and generally venting various versions of “WHAT THE FUCK IS HAPPENING” into a friend Discord channel called #yelling. Things eventually got to the point where rebooting all the network hardware seemed like a reasonable troubleshooting step — I mean, can’t hurt and maybe some cache on a switch got poisoned or something, right? — so I did that and went out to lunch. (Yes, I spent the entire morning trying to figure this out…)

When I got back from lunch, I also realized that ping was exhibiting the same behavior as SSH: I couldn’t ping either laptop from the desktop, but both laptops could ping the desktop. Fortunately, ping gave the much more informative error message of

ping: sendto: No route to host

Based on this, and some additional desperate web searching (I mean, I’d been doing this all along, but fruitlessly up to this point), I eventually found somebody having a very similar problem in VS Code terminals — and somebody suggesting to check the recently added “Privacy & Security” setting of “Local Network” …and that was when I discovered that somewhere along the line, local network access had been denied for iTerm. Flipping that switch back to “on” completely resolved the problem …around 3pm Friday afternoon; yes, I spent most of a day figuring this out.

I’m not sure how that setting got turned off. I don’t recall changing it, but I know I’ve seen the dialog box for it for a few programs at this point, and the way it’s worded makes me think I should deny the access, so it’s entirely possible I turned it off without realizing it.

After thinking about it for a bit, I could understand the asymmetric behavior too. The way this new security feature seems to work is by dynamically altering the routing table for programs and their children. Since local network access was denied for iTerm, the ssh and ping commands running as grandchildren of the terminal couldn’t see a route to those destinations. But if I SSH’d in from a laptop to the desktop, the sshd process that was the parent of my shell had no such restriction, and could communicate with the destination hosts just fine.
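If you want to check for this kind of per-program asymmetry yourself, one option is to run the same small probe twice on the affected machine: once from the terminal app, and once from a shell you reached by SSHing into that machine from somewhere else. Here is a minimal Python sketch of such a probe. The `probe` function is a hypothetical helper of mine, not part of any tool mentioned above, and I am assuming (not asserting) that a blocked connection shows up as an errno such as `EHOSTUNREACH`, which is the code behind ping’s “No route to host” message.

```python
import errno
import socket


def probe(host, port=22, timeout=3.0):
    """Attempt a TCP connection and return (ok, description).

    Hypothetical helper for illustration: it only reports what the OS
    returns, so what a blocked program actually sees may differ.
    """
    try:
        # create_connection resolves the name and connects in one step.
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        # Map the numeric errno (if any) to its symbolic name, e.g.
        # EHOSTUNREACH or ECONNREFUSED. Timeouts and some lookup
        # failures carry no usable errno and show up as "unknown".
        name = errno.errorcode.get(exc.errno, "unknown")
        return False, f"{name}: {exc}"
```

Calling `probe("bumfit")` from a terminal window and again from an inbound SSH session on the same machine, then comparing the two results, should make the difference visible: if one fails and the other connects, the problem is probably tied to the program you launched from, not to the network.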

I even, eventually, figured out why I was still able to SSH to the local Linux host from the desktop. The Linux host runs BIND for my local network and gets set as the DNS server for the desktop, so the “Privacy & Security” setting must have been smart enough to realize it shouldn’t block access to that. Since the blocking works on a routing level, that meant network access to that host was generally not blocked.

I’m writing this up in the hopes that the next person that runs into this odd networking problem, with the unhelpful SSH message, will be able to find this in a web search and will not have to spend most of a workday on it.