The “Small Brain vs. Big Brain” Problem in Autonomous Robotics
Today, I was reading this: https://x.com/jsrailton/status/2064661778978533571
In a nutshell: sneaky malware developers add forbidden language to their payload to trip keyword-based, hardcoded LLM safety mechanisms before the model can even inspect the malware.
We have had a similar problem in autonomous robotics for a decade.
The general issue is this: you build a smart end-to-end AI robot. Inputs are basically raw sensors, outputs are basically raw actuation. You train it in an AI gym plus real-world data and ta-da, you have a dancing robot or a self-driving car.
Then come the annoying people asking how to make it safe.
So you hire real safety specialists, the ISO 26262 / DO-178C / IEC 62304 types, and they apply their hard-and-fast rule: just fully specify a safety monitoring system and have it check the end-to-end system in real time.
By now you probably see the parallel with malware: every time you try to second-guess a higher-order reasoning system with simple hardcoded rules, the whole system becomes only as capable as that set of hardcoded rules.
The conversation usually starts like this:
Safety people: “We will get the output of the inference and push it through a trajectory checker that will get the predicted position of the various actors and…”
Me, cutting them off immediately: “This won’t work.”
It is simple: if the “small brain” in your system can understand and check what the “big brain” is doing, why did we build the expensive big brain in the first place?
Just let the small brain control the entire robot. Problem solved.
The only way a small brain can maybe make your big brain somewhat safer is by relying on certainty from the laws of physics over a very short time horizon, where basically no reasoning or scene understanding occurs.
Example: within the next second, there is physically no way to avoid hitting whatever is directly in front of us.
Note: this is exactly why some roboticists are adamant that purely camera-based perception will never be enough for safety: you only have derived measurements, not direct physical measurements of the world.
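To make the short-horizon physics check concrete, here is a minimal Rust sketch. Everything in it is an assumption for illustration: the names (RangeSample, Verdict, check), the straight-line, braking-only model, and the constant-deceleration formula d = v * t_latency + v^2 / (2 * a_max). It assumes the range and closing speed come from a direct measurement, and it deliberately does no scene understanding.

```rust
/// Hypothetical direct measurement of whatever is straight ahead.
#[derive(Debug, Clone, Copy)]
struct RangeSample {
    /// Measured distance to the obstacle, in meters.
    distance_m: f64,
    /// Speed at which we are closing on it, in meters per second.
    closing_speed_mps: f64,
}

#[derive(Debug, PartialEq)]
enum Verdict {
    /// Braking alone can still stop us before the obstacle.
    Clear,
    /// Even full braking cannot stop us in the measured distance.
    ImpactUnavoidable,
}

/// Shortest distance needed to stop under constant deceleration,
/// including the distance covered during the actuation latency.
fn min_stopping_distance_m(speed_mps: f64, max_decel_mps2: f64, latency_s: f64) -> f64 {
    if max_decel_mps2 <= 0.0 {
        // Assumption: with no braking authority we can never stop.
        return f64::INFINITY;
    }
    speed_mps * latency_s + speed_mps * speed_mps / (2.0 * max_decel_mps2)
}

/// The "small brain": pure kinematics over a very short horizon.
fn check(sample: RangeSample, max_decel_mps2: f64, latency_s: f64) -> Verdict {
    if sample.closing_speed_mps <= 0.0 {
        // Not closing on the obstacle, nothing to decide here.
        return Verdict::Clear;
    }
    let needed = min_stopping_distance_m(sample.closing_speed_mps, max_decel_mps2, latency_s);
    if sample.distance_m < needed {
        Verdict::ImpactUnavoidable
    } else {
        Verdict::Clear
    }
}
```

With a closing speed of 20 m/s, a maximum deceleration of 8 m/s² and 0.1 s of latency, the stopping distance works out to 2 + 25 = 27 m, so an obstacle measured at 20 m is flagged and one at 40 m is not. Note that the check knows nothing about what the obstacle is or what is behind us.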
But even then, this is more of a mitigation layer than a real safety system. It is reactive. It is already late when it acts. And it still acts blindly. You might minimize the hit against a pebble in front of you while an 18-wheeler that cannot stop rams into you from behind.
This is why the only real way to build a safety case for an autonomous robot is to diligently cover your operational domain with actual logs that you can verify against.
And the only way to make that work is to have a deterministic runtime, so verification is consistent, repeatable, and meaningful.
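As a rough illustration of why determinism matters for log-based verification, here is a small Rust sketch. It is not the copper-rs API. The types (LogEntry, Controller), the integer filter, and the 12 m threshold are all hypothetical. The only point is that when a step depends solely on its state and its logged input (no wall clock, no randomness, no floating-point reordering), replaying a log must reproduce the recorded outputs exactly, and any divergence is a real finding.

```rust
/// Hypothetical braking threshold on the filtered range, in millimeters.
const BRAKE_THRESHOLD_MM: u32 = 12_000;

/// One logged tick: the input the controller saw and the command it emitted.
#[derive(Debug, Clone, Copy, PartialEq)]
struct LogEntry {
    tick: u64,
    range_mm: u32,
    brake_cmd: u8,
}

/// Toy controller whose output depends only on its state and its input.
struct Controller {
    filtered_mm: Option<u32>,
}

impl Controller {
    fn new() -> Self {
        Self { filtered_mm: None }
    }

    /// Integer low-pass filter followed by a threshold.
    /// Integer math keeps the result bit-identical across runs.
    fn step(&mut self, range_mm: u32) -> u8 {
        // Seed the filter with the first sample it sees.
        let prev = self.filtered_mm.unwrap_or(range_mm) as u64;
        let filtered = ((prev * 3 + range_mm as u64) / 4) as u32;
        self.filtered_mm = Some(filtered);
        if filtered < BRAKE_THRESHOLD_MM { 255 } else { 0 }
    }
}

/// Run the controller on live inputs and record what it did.
fn record(inputs: &[u32]) -> Vec<LogEntry> {
    let mut controller = Controller::new();
    inputs
        .iter()
        .enumerate()
        .map(|(i, &range_mm)| LogEntry {
            tick: i as u64,
            range_mm,
            brake_cmd: controller.step(range_mm),
        })
        .collect()
}

/// Replay a log through a fresh controller.
/// Returns the first tick where the replayed output differs from the log.
fn replay(log: &[LogEntry]) -> Result<(), u64> {
    let mut controller = Controller::new();
    for entry in log {
        if controller.step(entry.range_mm) != entry.brake_cmd {
            return Err(entry.tick);
        }
    }
    Ok(())
}
```

Recording a run and replaying it returns Ok(()). If a logged command is altered, or if the controller were nondeterministic, replay returns the tick where the two diverge. Without determinism, that mismatch could be noise and the log would prove nothing.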
copper-rs enters the chat.