agentic experience for Go

ax-go is a Go library that makes CLIs predictable for LLM agents: trace IDs that cross the plugin boundary, one function behind the human table and the agent's JSON, and deterministic output.

2026-06-17

#projects #agents #ai #go

Years ago at RightScale I learned more about distributed systems from broken log files than from any design doc. A request came in the front door, fanned out through a workflow service, hit a plugin, and the plugin called some cloud API. When it failed, the only way to find where was to line up the logs of every service it passed through. So we threaded a trace ID from the frontend all the way to the cloud call and back. With it, a failure was a search query. Without it, you were guessing.

the lesson came back

It stuck. I just had to learn it twice.

Building finfocus, a FinOps CLI that talks to cloud providers through gRPC plugins, I hit the same wall. Except this time there was a second reader who couldn’t follow the logs: the agent. I’d ask Claude to find why a plugin call failed, and it would dig through the finfocus logs, reach the gRPC boundary, and find nothing on the other side. Unconnected traces, or no traces at all. I’d reached for zerolog early, but I’d wired it up wrong. The CLI logged. The plugin logged. Nothing tied the two together, so neither of us could see across the boundary.

Once the trace ID actually crossed that boundary, troubleshooting got fast. The agent could make a call, follow it from the finfocus CLI through the gRPC plugin and back, and tell me which side broke. Bug hunting went from a séance to a grep.

the second wall

The next problem was stranger. finfocus has a TUI: pretty tables, built for human eyes. When I asked an agent to read one, it got the numbers wrong, because a table is laid out for a person, not a parser. Agents were bad at this back then.

So every command that renders a table also got a --json flag that runs the same function the table does. Two renderings, one source of truth: pretty for humans, structured for agents. They can’t disagree about the total, because the total is computed once.

canonizing it

Then I did all of it again in gh-aw-fleet.

By the time I was copying the same plumbing into a third project (stream separation, trace propagation, structured errors, a JSON twin for every human view), the question answered itself. Why am I rewriting this per repo? Canonize it once, let other people use it, and get the wisdom of the crowd (and the clankers) to make it better.

That’s ax-go: Agentic Experience for Go.

what it is

ax-go is a single Go package (github.com/rshade/ax-go, imported as ax) that encodes the conventions a CLI needs so an agent can use it as reliably as a human can. The rules I kept rediscovering, written down once:

stdout is data, stderr is everything else. The final JSON payload is the only thing on stdout. Logs, progress, and error envelopes go to stderr. An agent pipes stdout into a parser; you still read the logs.
Traces that cross the boundary. W3C Trace Context rides context.Context through OpenTelemetry, so one call stays correlated from the CLI to whatever it calls.
Same input, same bytes. Two runs on the same input produce byte-identical stdout. An agent diffs outputs to catch drift, which is the machine version of trust.
A __schema command. Every tool can describe its own commands, flags, and types as JSON, so an agent grounds itself instead of guessing. There’s an MCP adapter too.
Agent-safety primitives. An auto-generated --idempotency-key so a retried create can’t run twice, a universal --dry-run, deterministic exit codes, and a structured error envelope.

where it’s at

It’s released and pre-1.0. v0.1.0 is the tag to pin. The output contracts are already frozen in code and checked by golden tests, so the shapes an agent depends on won’t move underneath it.

It starts with what I use: zerolog and OpenTelemetry. Other structured loggers and output formats will follow, but I’d rather ship the opinions I’ve tested in finfocus and gh-aw-fleet than guess at the ones I haven’t. It’s the common DNA for my own tools first. If it’s useful to yours, even better.

Years later, I’m still threading trace IDs across plugin boundaries. The only difference is who’s reading them now.