Where the Milliseconds Go
A useful exercise before optimizing anything: write down the budget. For a closed-loop manipulation step we target a control rate that the impedance controller can comfortably consume. The budget breaks down into camera capture, image preprocessing, perception inference, policy inference, action post-processing, and transport to the motor controller.
Each of these is bounded individually, not as an aggregate, so a single slow stage can't silently eat into the next. When a stage misses its budget, the watchdog (below) takes over.
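The per-stage bounding can be sketched as follows. The stage names come from the budget above; the millisecond values are illustrative placeholders, not our production numbers, and `BudgetMiss` is a hypothetical exception the watchdog would consume.

```python
import time

# Illustrative per-stage budgets in milliseconds (placeholder values).
# Each stage is bounded individually; the first miss raises immediately,
# so a slow stage cannot silently eat into the next one's budget.
BUDGET_MS = {
    "capture": 4.0,
    "preprocess": 2.0,
    "perception": 12.0,
    "policy": 8.0,
    "postprocess": 1.0,
    "transport": 3.0,
}

class BudgetMiss(Exception):
    def __init__(self, stage, elapsed_ms):
        super().__init__(
            f"{stage} took {elapsed_ms:.2f} ms (budget {BUDGET_MS[stage]} ms)")
        self.stage = stage

def run_stage(name, fn, *args):
    """Run one pipeline stage and enforce its individual deadline."""
    t0 = time.perf_counter()
    out = fn(*args)
    elapsed_ms = (time.perf_counter() - t0) * 1e3
    if elapsed_ms > BUDGET_MS[name]:
        raise BudgetMiss(name, elapsed_ms)
    return out
```

With these placeholder numbers the aggregate is 30 ms, i.e. a ceiling of roughly 33 Hz; the point is that the watchdog fires on the first individual miss rather than on the aggregate.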
A Safety Classifier in Parallel
The main policy is a multi-modal transformer. The safety classifier is a small, fast network whose only job is to answer one question: is there a hand, or any unexpected human body part, inside the cabin work envelope?
It runs on a separate stream, on a separate execution context, at a higher rate than the main policy. Its veto goes directly to the controller and can preempt any in-flight motion. This is deliberately redundant with the hardware safety system; defense in depth matters when the answer to 'is a person there?' has to be wrong essentially never.
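The veto path can be sketched with ordinary threads. Everything here is illustrative: `hand_in_envelope` stands in for the small classifier, `get_frame` for the camera feed, and the rate is a placeholder, but the shape is the point: the safety loop owns a flag, and the controller checks it before every command.

```python
import threading
import time

class SafetyVeto:
    """High-rate safety loop that can preempt any in-flight motion."""

    def __init__(self):
        self._veto = threading.Event()

    def safety_loop(self, get_frame, hand_in_envelope, rate_hz=100.0, stop=None):
        # Runs on its own thread (in practice, its own CUDA stream as well),
        # independent of and faster than the main policy loop.
        period = 1.0 / rate_hz
        while stop is None or not stop.is_set():
            if hand_in_envelope(get_frame()):
                self._veto.set()       # veto goes straight to the controller
            else:
                self._veto.clear()
            time.sleep(period)

    def gate(self, command):
        """Controller-side check: drop the command while the veto is active."""
        return None if self._veto.is_set() else command
```

A `None` from `gate` means the controller holds instead of executing the command; the hardware safety system sits underneath this as the second layer of the defense in depth.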
INT8 for Perception, FP16 for the Head
TensorRT INT8 is great for the perception backbone — convolutions and attention layers tolerate quantization well with a representative calibration set. We are more conservative on the action head: FP16 here, because small numerical drift in the predicted action distribution shows up as noisy motion that the controller has to filter out.
The trade is worth it. Perception is the heavy stage; quantizing it to INT8 frees enough budget to keep the action head in FP16 without missing the control rate.
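A toy round-trip makes the action-head concern concrete. This is not our inference code, just symmetric INT8 quantization against an FP16 round trip on a single made-up action value, using Python's half-precision `struct` format:

```python
import struct

def int8_roundtrip(x, max_abs=1.0):
    # Symmetric INT8 quantization over an assumed action range [-1, 1]:
    # one step is max_abs / 127, regardless of how small the signal is.
    scale = max_abs / 127.0
    q = max(-127, min(127, round(x / scale)))
    return q * scale

def fp16_roundtrip(x):
    # Pack/unpack through IEEE 754 half precision ('e' format).
    return struct.unpack("<e", struct.pack("<e", x))[0]

x = 0.01342  # a small commanded joint delta (arbitrary value)
err_int8 = abs(int8_roundtrip(x) - x)
err_fp16 = abs(fp16_roundtrip(x) - x)
```

For deltas near zero, the INT8 error is a fixed fraction of the full range and can be comparable to the signal itself, while FP16 error scales with the value. That fixed-step error is exactly the noisy motion the controller would otherwise have to filter out, which is why the head stays in FP16.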
Graceful Retract on Timeout
If any perception or policy stage misses its deadline, the watchdog issues a retract command: lift the tool a few centimeters along the surface normal, hold position, and re-issue the perception request. From the operator's perspective the robot pauses; from the system's perspective it is in a known safe state, not executing a stale plan.
This is far more useful than raising a fault and halting, which is the default behavior of most off-the-shelf inference timeouts.
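The retract path is small enough to sketch in full. The controller and perception interfaces here (`retract_along_normal`, `hold_position`, `request`) are hypothetical names, and the retract distance is an assumption standing in for "a few centimeters":

```python
RETRACT_DIST_M = 0.03  # assumed value for "a few centimeters"

class Watchdog:
    def __init__(self, controller, perception, deadline_s):
        self.controller = controller
        self.perception = perception
        self.deadline_s = deadline_s

    def check(self, stage_started_at, now):
        """Return True if the stage met its deadline.

        On a miss: lift off the surface, hold in a known safe state,
        and re-issue the perception request instead of executing a
        stale plan.
        """
        if now - stage_started_at <= self.deadline_s:
            return True
        self.controller.retract_along_normal(RETRACT_DIST_M)
        self.controller.hold_position()
        self.perception.request()
        return False
```

The key property is that the miss handler ends in `hold_position` plus a fresh perception request, never in an e-stop: the robot pauses and re-observes rather than faulting out.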
Inside an Enclosed Wash Bay
Wash bays are warm and humid. The Jetson runs at a fixed power mode chosen to keep junction temperature comfortably below thermal throttling under continuous load, with airflow assisted by a sealed fan duct. We monitor `tegrastats` continuously and alert if sustained throttling is detected — a thermal-throttled inference path is a silent latency regression that will not show up in any unit test.
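The alerting side can be sketched as a line parser. `tegrastats` output varies across JetPack releases, so the temperature-token pattern (`zone@temp C`) and the threshold below are assumptions to adapt to your device, not a spec:

```python
import re

# Matches temperature tokens of the form "<zone>@<temp>C", e.g. "GPU@72C".
# The exact token set and format depend on the Jetson model and JetPack
# version; verify against your own tegrastats output.
TEMP_TOKEN = re.compile(r"([A-Za-z0-9_]+)@([\d.]+)C")
ALERT_LIMIT_C = 85.0  # assumed threshold, set below the throttle point

def hot_zones(tegrastats_line, limit_c=ALERT_LIMIT_C):
    """Return {zone: temp} for zones at or above the alert limit."""
    return {zone: float(temp)
            for zone, temp in TEMP_TOKEN.findall(tegrastats_line)
            if float(temp) >= limit_c}
```

Feeding each emitted line through `hot_zones` and alerting on a sustained run of non-empty results catches the throttling case before it becomes a silent latency regression.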