Costs & Budgets

SLAW treats AI compute the way a business treats payroll — every agent has a budget, and when it runs out, the agent stops.

This isn't a warning or a notification. An agent that exhausts its monthly allocation is auto-paused on its next heartbeat and stops being woken. The budget is an enforced cap checked at each heartbeat — not a real-time mid-run cutoff, so a run already in flight can finish before the pause applies. It keeps spend bounded rather than open-ended.
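The timing described above can be sketched as follows. This is an illustrative model only, not SLAW's implementation: the `Agent` class, the `heartbeat` function, and the dollar amounts are all hypothetical. It shows why a cap checked at heartbeat time can be overshot by the last run that was already allowed to start.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    # Hypothetical model of an agent; field names are assumptions.
    name: str
    monthly_budget: float  # assumed unit: dollars
    spent: float = 0.0
    paused: bool = False


def heartbeat(agent: Agent, run_cost: float) -> bool:
    """Return True if the agent was woken and ran, False if it was paused."""
    # The cap is checked only here, before a run starts.
    if agent.paused or agent.spent >= agent.monthly_budget:
        agent.paused = True
        return False
    # A run that has started is not cut off, so it may overshoot the cap.
    agent.spent += run_cost
    return True


agent = Agent("reporter", monthly_budget=10.0)
results = [heartbeat(agent, run_cost=4.0) for _ in range(4)]
```

In this sketch the third run starts at 8.0 spent, finishes at 12.0, and the pause only takes effect on the fourth heartbeat.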

The token salary model

Think of each agent's monthly budget as its salary. The agent spends down its allocation as it works: each heartbeat consumes tokens, tokens map to cost, and cost is tracked against the budget. When the agent nears its limit, SLAW flags it. When it reaches the limit, it is paused automatically at its next heartbeat.
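As a rough illustration of the chain from tokens to cost to budget status, here is a minimal sketch. The price per 1,000 tokens and the 80% flagging threshold are assumptions made for the example; the source does not state how close "close to its limit" is, and `cost_of` and `budget_status` are hypothetical helpers.

```python
from decimal import Decimal

# Assumed threshold for flagging; the real value is not given in this document.
WARN_FRACTION = Decimal("0.8")


def cost_of(tokens: int, price_per_1k: Decimal) -> Decimal:
    """Convert a token count to a dollar cost at an assumed flat price."""
    return Decimal(tokens) / Decimal(1000) * price_per_1k


def budget_status(spent: Decimal, budget: Decimal) -> str:
    """Classify spend against the budget as 'ok', 'flagged', or 'paused'."""
    if spent >= budget:
        return "paused"
    if spent >= budget * WARN_FRACTION:
        return "flagged"
    return "ok"


heartbeat_cost = cost_of(50_000, Decimal("0.01"))
status = budget_status(Decimal("85"), Decimal("100"))
```

`Decimal` is used here so that repeated small cost events do not accumulate floating-point rounding error.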

Budgets exist at two levels:

Per-agent — an individual spending cap. A senior engineer might have a higher monthly allocation than a junior one; a high-frequency monitoring agent might need more than an occasional report generator.
Per-squad — a squad-level cap that applies across all agents in the squad. Even if no individual agent hits its own limit, the squad as a whole will stop if the squad budget is exhausted.

Both limits apply simultaneously. An agent is constrained by whichever is reached first.
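The "whichever is reached first" rule can be expressed as a pair of small checks. This is a sketch under the assumption that both budgets are tracked in the same unit; the function names are hypothetical.

```python
def can_wake(agent_spent: float, agent_budget: float,
             squad_spent: float, squad_budget: float) -> bool:
    """An agent may run only while both its own cap and the squad cap have headroom."""
    return agent_spent < agent_budget and squad_spent < squad_budget


def remaining(agent_spent: float, agent_budget: float,
              squad_spent: float, squad_budget: float) -> float:
    """Headroom is the tighter of the two limits, never below zero."""
    return max(0.0, min(agent_budget - agent_spent, squad_budget - squad_spent))


# The agent has spent 10 of 50, but the squad budget of 100 is already used up.
blocked_by_squad = can_wake(10, 50, 100, 100)
headroom = remaining(10, 50, 90, 100)
```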

Metered vs subscription billing

SLAW represents two billing models used by AI providers:

Metered — dollar cost tracked per API call. You pay per token, and SLAW records the dollar amount of each cost event.
Subscription — token-based tracking without a direct dollar cost per call. Used for providers or tiers where you pay a flat fee and consume tokens within that allocation.

You can mix both models across agents in the same squad. A Claude Code agent running on pay-as-you-go metered billing sits alongside a Codex agent on a subscription plan — SLAW tracks each correctly against its budget type.
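One way to picture the two models side by side is a cost event that always carries a token count and carries a dollar amount only when billing is metered. The `CostEvent` shape, the field names, and the agent names below are assumptions for illustration, not SLAW's actual schema.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class CostEvent:
    # Hypothetical event shape; not SLAW's real data model.
    agent: str
    billing: str                     # "metered" or "subscription"
    tokens: int
    dollars: Optional[float] = None  # set only for metered events


def usage_by_agent(events: Iterable[CostEvent]) -> Dict[str, Dict[str, float]]:
    """Total tokens for every agent, and dollars for metered events only."""
    totals: Dict[str, Dict[str, float]] = {}
    for event in events:
        entry = totals.setdefault(event.agent, {"tokens": 0, "dollars": 0.0})
        entry["tokens"] += event.tokens
        if event.billing == "metered":
            if event.dollars is None:
                raise ValueError("metered events need a dollar amount")
            entry["dollars"] += event.dollars
    return totals


totals = usage_by_agent([
    CostEvent("claude-code", "metered", tokens=20_000, dollars=0.25),
    CostEvent("claude-code", "metered", tokens=40_000, dollars=0.5),
    CostEvent("codex", "subscription", tokens=30_000),
])
```

Under this sketch, a metered agent would be compared against a dollar budget and a subscription agent against a token budget.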

Burn rate and forecasting

The dashboard shows each agent's and squad's spend over the current billing period, alongside a burn rate — how fast they're spending relative to the period remaining. An agent spending heavily in week one of a monthly budget period will show a high burn rate and an estimated exhaustion date, giving you time to adjust the allocation or the agent's workload before it stops.
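A simple way to reason about burn rate is linear extrapolation from spend so far. The sketch below assumes that model; the formula SLAW's dashboard actually uses is not specified here, and `pace_ratio` and `forecast_exhaustion` are hypothetical names.

```python
from datetime import date, timedelta
from typing import Optional


def pace_ratio(spent: float, budget: float,
               days_elapsed: int, period_days: int) -> float:
    """Fraction of budget used divided by fraction of period elapsed.

    A value above 1.0 means spending is ahead of an even pace.
    """
    return (spent / budget) / (days_elapsed / period_days)


def forecast_exhaustion(spent: float, budget: float,
                        period_start: date, today: date) -> Optional[date]:
    """Estimate the exhaustion date by extending the average daily spend."""
    days_elapsed = (today - period_start).days
    if days_elapsed <= 0 or spent <= 0:
        return None  # not enough data to extrapolate
    daily_rate = spent / days_elapsed
    days_left = max(0.0, (budget - spent) / daily_rate)
    # Partial days are truncated to whole days.
    return today + timedelta(days=int(days_left))


ratio = pace_ratio(spent=70.0, budget=100.0, days_elapsed=7, period_days=31)
exhausted_on = forecast_exhaustion(70.0, 100.0, date(2025, 1, 1), date(2025, 1, 8))
```

Here an agent that has spent 70 of 100 after one week is running at roughly three times an even pace and would be forecast to exhaust its budget three days later.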

When an agent hits its limit

The agent's status changes to paused.
It stops receiving heartbeats.
Its in-progress issues remain checked out — the agent holds its place in the queue.
When the Operator increases the budget or the billing period resets, the agent can be resumed and picks up where it left off.

No work is lost. The agent's state is preserved; it just waits until it has budget to spend.
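The pause and resume behavior above can be modeled as a small state change that leaves checked-out work untouched. This is a sketch with hypothetical names (`AgentState`, `pause_if_exhausted`, `resume`) and a made-up issue identifier; it is not SLAW's API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AgentState:
    # Hypothetical state record; field names are assumptions.
    budget: float
    spent: float = 0.0
    status: str = "active"
    checked_out: List[str] = field(default_factory=list)


def pause_if_exhausted(agent: AgentState) -> None:
    """Pause the agent once spend reaches the budget; issues stay checked out."""
    if agent.spent >= agent.budget:
        agent.status = "paused"


def resume(agent: AgentState) -> bool:
    """Resume only if the agent is paused and has budget headroom again."""
    if agent.status == "paused" and agent.spent < agent.budget:
        agent.status = "active"
        return True
    return False


agent = AgentState(budget=10.0, spent=10.0, checked_out=["ISSUE-1"])
pause_if_exhausted(agent)
resumed_too_early = resume(agent)  # still no headroom
agent.budget = 20.0                # the Operator raises the budget
resumed = resume(agent)
```

A billing period reset would be modeled the same way, by setting `spent` back to zero before calling `resume`.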

Fleet-level budgets

When a SLAW instance is enrolled with a Botfather tower, a second budget layer comes into play. The tower admin can set cost and token limits at the enterprise level: defaults that apply to every enrolled instance, with optional per-instance overrides. Tower-set limits can only be stricter than local limits, never looser. The local Operator controls the floor, and the tower controls the ceiling.
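The "stricter, never looser" rule amounts to taking the smaller of the local limit and the applicable tower limit. The sketch below assumes a per-instance override, when present, replaces the tower default for that instance; `effective_limit` and its parameters are hypothetical.

```python
from typing import Optional


def effective_limit(local: float,
                    tower_default: Optional[float] = None,
                    instance_override: Optional[float] = None) -> float:
    """Return the limit that actually applies to an enrolled instance."""
    # Assumption: an override, if set, takes the place of the tower default.
    tower = instance_override if instance_override is not None else tower_default
    if tower is None:
        return local  # not enrolled, or no tower limit set
    # The tower can tighten the local limit but never loosen it.
    return min(local, tower)


tightened = effective_limit(local=100.0, tower_default=80.0)
unchanged = effective_limit(local=50.0, tower_default=80.0)
```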

Related concepts

The Squad Model: the org structure that determines whose budget is whose.
Operator & Governance: who sets and enforces budgets.
The Fleet & the Tower: fleet-level budget management via Botfather.
