Your engineers run Claude Code. Your designers are in Cowork. Half the company has Claude open in a browser tab, and a few are on Cursor. All of it runs on their laptops, each person authenticated a different way, and none of it touches your gateway. The only record you get is one lump-sum bill at the end of the month.
Now you can capture that usage where it happens: on the laptop. A collector there sees the AI calls your gateway never will: Claude Code, Cowork, desktop chat, the browser, however they’re authenticated, including Claude bought through Vertex. Each call shows up within seconds, breaks out by model, provider, agent, and user within minutes, and lands fully allocated to the person and the work behind it within hours.
A gateway only sees what you route through it. The largest, least visible share of your AI spend lives out on laptops, outside any proxy, which is where gateway-based tooling goes dark.
Why this matters
That laptop work is the AI spend that maps most directly to an outcome, and the spend nobody can otherwise see: an engineer shipping a feature with Claude Code, a marketer building a campaign in Cowork. Capture it and a dollar ties to the work it did, not to a name on a bill.
What we built
A local macOS collector and network extension. It intercepts AI provider traffic on the machine and captures each call: tokens, model, provider, latency. But the raw numbers are the easy part; anyone can get those. The collector also attaches the business context only the endpoint can see: who ran the call, the repo and project behind it, the work it belongs to. That’s what turns activity into allocation, and it streams into the same allocation engine that’s run CloudZero in production for ten years. The collector is built to leave non-AI traffic untouched. It’s a heavier lift than the gateway path: MDM deployment, a security review. In return it reaches the spend and the context the gateway never can.
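To make the "activity into allocation" step concrete, here is a minimal sketch of what one captured call might look like and how it could roll up by any dimension. It is an illustration, not the collector's actual schema: the `CapturedCall` fields, the `allocate` helper, the `example-model` name, and the prices are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CapturedCall:
    """One intercepted AI call (hypothetical shape, for illustration only)."""
    # Raw numbers taken from the call itself.
    provider: str
    model: str
    input_tokens: int
    output_tokens: int
    latency_ms: int
    # Business context that only the endpoint can see.
    user: str
    repo: str
    project: str


# Made-up USD prices per million tokens as (input, output); not real rates.
PRICE_PER_MTOK = {"example-model": (3.00, 15.00)}


def call_cost(call: CapturedCall) -> float:
    """Price a single call from its token counts."""
    in_price, out_price = PRICE_PER_MTOK[call.model]
    return (call.input_tokens * in_price + call.output_tokens * out_price) / 1_000_000


def allocate(calls: list[CapturedCall], dimension: str) -> dict[str, float]:
    """Sum cost by any field on the record, e.g. 'user', 'repo', or 'model'."""
    totals: dict[str, float] = defaultdict(float)
    for call in calls:
        totals[getattr(call, dimension)] += call_cost(call)
    return dict(totals)
```

Because the context travels with each record, the same dollars can be regrouped by person, repo, or project without going back to the bill: `allocate(calls, "user")` and `allocate(calls, "repo")` read the same list.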

How design partners use it
It’s available to design partners now, and we run it ourselves: across CloudZero it captures first-party Claude usage (Claude Code, Cowork, chat) and, as of this month, Anthropic models served through Google Vertex, end to end. We put an engineer alongside each partner to get through deployment and the security review together, then widen the rollout outward from a handful of laptops.