Claude Code Usage and Credits: The /usage Command, Limits, and the June 15 Billing Change

How to read the new /usage command, what actually drives consumption, and what changed when Agent SDK and headless usage moved onto a separate monthly credit.

Two things changed how Claude Code usage works this spring, and they landed less than a month apart. In Week 21 (May 18 to 22, 2026), Anthropic shipped a /usage command that finally tells you which skills, subagents, plugins, and MCP servers are eating your plan. Then on June 15, Agent SDK and headless usage moved off your subscription onto a separate credit pool. If you have been staring at a "limit reached" message wondering what you did to deserve it, this is the page that explains the meter.

What /usage actually shows now

Open Claude Code and type /usage. You get two distinct things on one screen, and the difference matters.

The top block is the Session block. It prints token counts and a dollar estimate for your current session, something like a $0.55 total cost over six minutes of API time. That figure is computed locally, on your machine, from token counts. It is an estimate, not a bill. If you are a Pro or Max subscriber, Anthropic is explicit that the dollar number does not touch your billing, because your usage is already covered by the subscription. The Session block exists mostly for raw API users who pay per token.
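To make "computed locally from token counts" concrete, here is a minimal sketch of that kind of arithmetic. It is not Claude Code's actual formula. The names `SessionTokens` and `ASSUMED_PRICE_PER_MTOK` are hypothetical, and the per-token rates are placeholders rather than Anthropic's published prices.

```python
from dataclasses import dataclass

# Hypothetical prices in USD per million tokens. Placeholders for illustration,
# NOT Anthropic's actual rates.
ASSUMED_PRICE_PER_MTOK = {"input": 3.00, "output": 15.00}


@dataclass
class SessionTokens:
    """Token counts for one session (hypothetical structure)."""
    input_tokens: int
    output_tokens: int


def estimate_session_cost(tokens: SessionTokens, prices=ASSUMED_PRICE_PER_MTOK) -> float:
    """Return a local dollar estimate from token counts. An estimate, not a bill."""
    raw = (
        tokens.input_tokens * prices["input"]
        + tokens.output_tokens * prices["output"]
    ) / 1_000_000
    return round(raw, 4)


estimate = estimate_session_cost(SessionTokens(input_tokens=100_000, output_tokens=10_000))
print(f"Estimated session cost: ~${estimate:.2f}")
```

The point of the sketch is that nothing here talks to a billing system: the number is only as good as the local token counts and whatever rates are plugged in.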

The part that matters for subscribers sits below that. On a Pro, Max, Team, or Enterprise plan, /usage shows a breakdown of what counts against your plan limits, and it attributes recent usage to skills, subagents, plugins, and individual MCP servers, each as a percentage of the total. Press d for the last 24 hours. Press w for the last 7 days. That toggle is the whole point: you can finally see that one chatty MCP server or one runaway subagent ran up 40% of your week.
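The percentage attribution is easy to picture as a small aggregation. The sketch below assumes a made-up record shape of `(source_name, tokens)` pairs; the source labels and numbers are invented for illustration and do not reflect how Claude Code stores session history.

```python
from collections import defaultdict


def attribution_shares(records):
    """Sum tokens per source and return {source: percent of total}, largest first.

    `records` is an iterable of (source_name, tokens) pairs. This shape is an
    assumption for the example, not Claude Code's real storage format.
    """
    totals = defaultdict(int)
    for source, tokens in records:
        totals[source] += tokens

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}

    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return {source: round(100 * tokens / grand_total, 1) for source, tokens in ranked}


# Invented sample data: one MCP server shows up twice and dominates the total.
sample = [
    ("mcp:github", 40_000),
    ("subagent:test-runner", 25_000),
    ("skill:release-notes", 10_000),
    ("mcp:github", 25_000),
]
shares = attribution_shares(sample)
for source, percent in shares.items():
    print(f"{source:24s} {percent:5.1f}%")
```

Sorting largest first mirrors how you would want to read the breakdown: the top row is the first thing to trim.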

The docs add a caveat worth taking seriously. The figures are approximate, and they come from local session history on this machine. Usage from another laptop, or from claude.ai in your browser, does not show up here at all. So treat /usage as a per-machine attribution tool rather than a global ledger. (source)

One more thing from that same Week 21 release: Anthropic renamed "Extra usage" to "usage credits" across the CLI, and /extra-usage became /usage-credits. The old command still works. But the rename now reads as a hint about where billing was heading three weeks later. (source)

What actually drives consumption

Token count is the meter. Everything downstream of it is just a different way to inflate that count, and the four categories /usage breaks out behave nothing alike.

MCP servers are the heavy ones. A single server can load dozens of tool definitions into context before you type a single character. Anthropic now defers MCP tool definitions by default, so only tool names enter context until Claude actually calls a tool, and that cuts the upfront cost hard. Run /context to see what is sitting in your window right now, and /mcp to disable servers you forgot you connected. The docs are blunt about this: CLI tools like gh, aws, and gcloud stay cheaper than the equivalent MCP server, because they add zero per-tool listing.

Skills are cheap by design. A skill stays dormant until Claude decides it is relevant, and only then does its body load. The discipline is keeping each skill tight, so the on-demand load stays small. A bloated skill that drags in three thousand tokens every time it fires will show up in your 7-day breakdown soon enough.

Subagents cut both ways, and this is where people get surprised. A subagent runs in its own context window and hands back only a summary, which keeps verbose work like test runs, log parsing, and doc fetching out of your main conversation. That saves tokens in your primary session. Spin up an agent team, though, and you are running several Claude instances at once. Anthropic puts agent teams at roughly 7x the tokens of a standard session when teammates run in plan mode, because each teammate carries its own full context. The /usage breakdown is where that cost stops being a mystery and starts being a number.

Plugins are containers. A plugin bundles skills, subagents, hooks, MCP configs, and slash commands into one install, so a single plugin can quietly bring along an expensive MCP server. The /plugin browse pane now shows projected context cost before you install, which is worth reading before you click. (source)

The limit model: rolling windows, not a monthly tank

Claude Code on Pro and Max does not give you one monthly bucket of usage. It runs on rolling windows, and the usage is shared across surfaces. Activity in claude.ai, Claude Desktop, and Claude Code all draw from the same limit. Burn a chunk of your allowance in a long browser conversation, and you have less left for the terminal.

The window most people hit first is the 5-hour one. The counter starts on your first prompt and resets five hours later. Your "session," in other words, is a rolling clock rather than something you open and close. On top of that sits a weekly limit that resets seven days after your session starts, tied to your own cycle and not to a fixed calendar day. Hit the weekly cap and usage locks until that 7-day reset, no matter how much room your 5-hour window still has.
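The two clocks can be sketched with plain datetime arithmetic. This is a simplified model of the mechanics described above, not Anthropic's implementation: the function names are hypothetical, and the "exhausted" flags stand in for whatever the service tracks internally.

```python
from datetime import datetime, timedelta

FIVE_HOUR_WINDOW = timedelta(hours=5)
WEEKLY_WINDOW = timedelta(days=7)


def reset_times(first_prompt_at: datetime, weekly_cycle_start: datetime):
    """Return (five_hour_reset, weekly_reset) for the given start times."""
    return first_prompt_at + FIVE_HOUR_WINDOW, weekly_cycle_start + WEEKLY_WINDOW


def can_send(now, five_hour_exhausted, weekly_exhausted, first_prompt_at, weekly_cycle_start):
    """Simplified model: the weekly cap blocks usage even if the 5-hour window has room."""
    five_hour_reset, weekly_reset = reset_times(first_prompt_at, weekly_cycle_start)
    if weekly_exhausted and now < weekly_reset:
        return False
    if five_hour_exhausted and now < five_hour_reset:
        return False
    return True


first_prompt = datetime(2026, 6, 1, 9, 0)
cycle_start = datetime(2026, 5, 28, 9, 0)
five_hour_reset, weekly_reset = reset_times(first_prompt, cycle_start)
print("5-hour window resets:", five_hour_reset)
print("weekly limit resets: ", weekly_reset)
```

In this model a 9:00 first prompt frees the 5-hour window at 14:00 the same day, while the weekly reset depends only on when your own 7-day cycle began.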

I am hedging the exact prompt counts on purpose. Third-party guides float figures like roughly 45 prompts per 5-hour window on Pro, with Max at 5x and 20x that. Anthropic's own help pages, though, describe the shared-limit and reset mechanics without committing to fixed per-window numbers, and those numbers have moved before. So treat any specific count you read (including that one) as a community estimate rather than a guarantee. The mechanics are the part you can rely on: two rolling windows, shared across every Claude surface, reset on your own clock. (source)

What changed on June 15

Here is the change that made everyone nervous about cost. As of June 15, 2026, four kinds of usage left the subscription pool and moved onto a separate monthly credit:

  • Claude Agent SDK usage (your Python and TypeScript agent projects)
  • The claude -p command, meaning headless or non-interactive Claude Code
  • Claude Code GitHub Actions
  • Third-party apps authenticated through the Agent SDK

What did not move: interactive Claude Code in your terminal or IDE, web, desktop, and mobile Claude conversations, and Claude Cowork. If you sit in front of the terminal and type, your subscription limits work exactly as they did before. The split only bites automation. (source)

The new credit is a monthly dollar amount, metered at standard API rates. Pro gets $20, Max 5x gets $100, Max 20x gets $200. Team Standard mirrors Pro at $20, and Team Premium mirrors Max 5x at $100. Now the detail that catches people out: it does not roll over. Unused credit at the end of a billing cycle is gone, and a fresh amount lands the next cycle. You claim it once through your Claude account, and after that it renews on its own.

When the credit runs dry, one of two things happens. If you have usage credits enabled, extra automation flows to standard API rates and you keep going on a real bill. If you do not, requests pause until the next refresh. So the practical move before June 15 was to claim the credit and decide, deliberately, whether you want overflow billing on or off for your CI pipelines. (source)
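The credit rules above fit in a small model. The plan amounts come from this article; everything else (the `CreditPool` class, the status strings, and the choice to spend the remaining credit before billing overflow on a single oversized request) is an assumption made for illustration.

```python
# Monthly credit in USD per plan, as listed in this article.
MONTHLY_CREDIT_USD = {
    "pro": 20,
    "max_5x": 100,
    "max_20x": 200,
    "team_standard": 20,
    "team_premium": 100,
}


class CreditPool:
    """Toy model of the monthly automation credit (hypothetical class)."""

    def __init__(self, plan: str, overflow_enabled: bool = False):
        self.plan = plan
        self.overflow_enabled = overflow_enabled
        self.remaining = float(MONTHLY_CREDIT_USD[plan])
        self.overflow_billed = 0.0

    def spend(self, cost: float) -> str:
        """Return 'credit', 'overflow', or 'paused' for one request."""
        if cost <= self.remaining:
            self.remaining -= cost
            return "credit"
        if self.overflow_enabled:
            # Assumption: use up what is left, bill the shortfall at API rates.
            self.overflow_billed += cost - self.remaining
            self.remaining = 0.0
            return "overflow"
        return "paused"  # request waits for the next refresh

    def start_new_cycle(self) -> None:
        """No rollover: the balance is replaced, not topped up."""
        self.remaining = float(MONTHLY_CREDIT_USD[self.plan])


ci_pool = CreditPool("pro", overflow_enabled=False)
print(ci_pool.spend(15.0))  # covered by the credit
print(ci_pool.spend(10.0))  # more than is left, overflow is off
```

Flipping `overflow_enabled` is the whole decision: with it off, an empty pool means stalled jobs; with it on, an empty pool means a bill.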

This is why a CI job that ran fine in May might have stalled mid-month in June. A claude -p step in GitHub Actions now draws from a finite $20 or $100 pool instead of your roomy subscription, and once that pool is dry and overflow is off, the job just waits. If you build on the SDK rather than only running headless commands, the Claude Agent SDK guide covers budgeting programmatic agents against this pool.

Practical ways to stretch a plan

Start with attribution. Run /usage, press w, and read the 7-day breakdown before you touch anything. Guessing which MCP server is expensive wastes the very tokens you are trying to save.

Then trim the obvious heavy hitters. Run /mcp and disable servers you are not using this week. Prefer gh over a GitHub MCP server and aws over an AWS one, since the CLI adds no per-tool context. Run /context to see what is loaded right now.

After that, manage context like it costs money, because it does. Use /clear between unrelated tasks so stale context stops riding along on every message. Keep your CLAUDE.md under about 200 lines, and push specialized instructions into skills that load on demand instead of sitting in base context all session. Drop to Sonnet for routine coding and reserve Opus for genuinely hard reasoning with /model. Lower the effort level with /effort on simple tasks, since thinking tokens bill as output. Hand verbose operations like test runs and log parsing to subagents, so the noise stays out of your main window.
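The 200-line guideline is easy to enforce mechanically, for example as a pre-commit check. A minimal sketch follows; the helper names are hypothetical and the default threshold is simply the rule of thumb from the paragraph above.

```python
from pathlib import Path


def check_line_budget(text: str, max_lines: int = 200):
    """Return (line_count, within_budget) for the given file contents."""
    line_count = len(text.splitlines())
    return line_count, line_count <= max_lines


def check_claude_md(path: str = "CLAUDE.md", max_lines: int = 200):
    """Read a CLAUDE.md file from disk and apply the same check."""
    return check_line_budget(Path(path).read_text(encoding="utf-8"), max_lines)


# Demonstration on an in-memory string so the example does not need a real file.
oversized = "- some instruction\n" * 250
count, ok = check_line_budget(oversized)
print(f"{count} lines, within budget: {ok}")
```

Line count is a crude proxy for tokens, but it is cheap to check and catches the usual failure mode of a CLAUDE.md that only ever grows.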

Keeping skills tight is the lever most people overlook. It is also where a tool like Knack helps non-coders author SKILL.md files that load lean instead of dragging a thousand stray tokens into context every time they fire.

One last move, specific to the June change. If you run automation, audit your claude -p and Agent SDK jobs now. Every headless call spends from a fixed pool that refills only once per billing cycle, and a single misconfigured loop that retries forever can drain a $20 Pro credit in an afternoon. Worse, /usage on your machine will not even show it, because headless runs on a server never touch your local session history. Check the Usage page in the Claude Console for the authoritative number on automated spend.
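One guard against the retry-forever failure mode is a hard cap on attempts around each headless call. The sketch below wraps `claude -p` with `subprocess`; the wrapper names, the default of three attempts, and the backoff delays are all assumptions, and the runner is injectable so the logic can be tested without calling the CLI.

```python
import subprocess
import time


def run_headless(prompt: str):
    """Run one headless call via `claude -p` and return (exit_code, stdout)."""
    result = subprocess.run(["claude", "-p", prompt], capture_output=True, text=True)
    return result.returncode, result.stdout


def run_with_retry_cap(prompt, max_attempts=3, base_delay_s=2.0,
                       runner=run_headless, sleep=time.sleep):
    """Retry a failing call at most `max_attempts` times, then give up loudly.

    Returns (attempts_used, output) on success. Raises RuntimeError once the
    cap is hit, so a broken job fails instead of spending credit indefinitely.
    """
    for attempt in range(1, max_attempts + 1):
        exit_code, output = runner(prompt)
        if exit_code == 0:
            return attempt, output
        if attempt < max_attempts:
            # Exponential backoff: base, 2x base, 4x base, ...
            sleep(base_delay_s * 2 ** (attempt - 1))
    raise RuntimeError(f"gave up after {max_attempts} failed attempts")
```

In a CI step you would call `run_with_retry_cap("your prompt here")` and let the raised error fail the job. A failed job is visible in your pipeline; a silent retry loop only shows up later as an empty credit pool.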