Usage & quotas

Usage reports show what your workspace consumed this cycle and let you wire up alerts before you hit a hard limit.

Usage dashboard

Current-cycle usage is broken down by:

Machine — commands run, AI tokens spent in chat sessions targeting that machine.
Member — chat tokens, schedule trigger count, command count.
Skill — invocations and tokens.
AI provider — Anthropic, OpenAI, Google, etc., with per-model breakdown.

Quota types

Each plan caps multiple resources:

Machine count — registered machines, regardless of online status.
AI tokens — input + output across all chat surfaces.
Storage — session transcripts, audit log, file index.
Schedule runs — number of triggered runs per cycle.
API requests — REST and gRPC, per minute and per cycle.

Quota alerts

Alerts fire at 80 %, 95 %, and 100 % of every quota and are delivered through these channels:

Email to all Owners.
Slack to the channel configured in Workspace settings → Integrations.
Webhook with a JSON payload ({quota, threshold, current, cycle_ends_at}).

You can mute thresholds per quota type — useful for low-priority limits like API requests.
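
A webhook receiver for these alerts could be sketched as follows. Only the four field names (quota, threshold, current, cycle_ends_at) come from the payload shape above. The value types, the ISO 8601 timestamp format, the example quota name `ai_tokens`, and the severity labels are assumptions made for illustration, so check them against a real payload before relying on them.

```python
import json
from datetime import datetime

# Field names are taken from the documented payload shape. The value types
# (string quota name, integer threshold, ISO 8601 timestamp) are assumptions.
REQUIRED_FIELDS = ("quota", "threshold", "current", "cycle_ends_at")


def parse_quota_alert(body: str) -> dict:
    """Parse a quota-alert webhook body and check that every field is present."""
    payload = json.loads(body)
    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    return {
        "quota": str(payload["quota"]),
        "threshold": int(payload["threshold"]),
        "current": payload["current"],
        "cycle_ends_at": datetime.fromisoformat(payload["cycle_ends_at"]),
    }


def severity(threshold: int) -> str:
    """Map the 80 / 95 / 100 thresholds to hypothetical severity labels."""
    if threshold >= 100:
        return "critical"
    if threshold >= 95:
        return "warning"
    return "info"


if __name__ == "__main__":
    sample = json.dumps({
        "quota": "ai_tokens",  # hypothetical quota identifier
        "threshold": 95,
        "current": 9600000,
        "cycle_ends_at": "2025-07-01T00:00:00+00:00",
    })
    alert = parse_quota_alert(sample)
    print(alert["quota"], severity(alert["threshold"]))
```

Validating the payload before acting on it keeps a malformed or partial delivery from triggering downstream automation.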

Hard vs soft limits

Limit	What happens at 100 %
Machine count	Hard — new `cmdop connect` registrations rejected. Existing agents keep working.
AI tokens	Hard — gateway returns quota error to chat / agent calls.
Storage	Soft — older transcripts pruned per retention policy.
Schedule runs	Soft — runs queue with a warning; resume next cycle.
API requests	Soft (per-minute) / Hard (per-cycle).

Hitting the machine quota stops new registrations but does not stop running agents. Hitting the AI quota returns gateway errors to chat / agent calls until you top up or upgrade.
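
One way to use the hard/soft distinction is in alert routing: escalate only when a hard limit is close, and log soft-limit alerts. The sketch below encodes the table above as a lookup. The quota identifiers, the 95 % escalation cut-off, and the `should_page` helper are all hypothetical and not part of the product.

```python
from enum import Enum


class Enforcement(Enum):
    HARD = "hard"
    SOFT = "soft"


# Enforcement per limit, transcribed from the table above.
# The dictionary keys are hypothetical identifiers chosen for this sketch.
LIMITS = {
    "machine_count": Enforcement.HARD,
    "ai_tokens": Enforcement.HARD,
    "storage": Enforcement.SOFT,
    "schedule_runs": Enforcement.SOFT,
    "api_requests_per_minute": Enforcement.SOFT,
    "api_requests_per_cycle": Enforcement.HARD,
}


def should_page(quota: str, threshold: int, page_at: int = 95) -> bool:
    """Return True when an alert concerns a hard limit at or above page_at percent.

    Unknown quota names are treated as hard limits so that nothing is silently ignored.
    """
    enforcement = LIMITS.get(quota, Enforcement.HARD)
    return enforcement is Enforcement.HARD and threshold >= page_at


if __name__ == "__main__":
    for quota, threshold in [("ai_tokens", 95), ("storage", 100), ("machine_count", 80)]:
        print(quota, threshold, should_page(quota, threshold))
```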

Historical usage

Browse the last 12 cycles and export per-cycle breakdowns as CSV. Useful for finance reconciliation and spotting anomalies.
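
As a rough illustration of anomaly spotting on an exported file, the sketch below flags cycles whose usage exceeds a multiple of the median. The column names `cycle` and `ai_tokens` are assumptions, since the export layout is not described here, and the 2x-median rule is just one simple heuristic.

```python
import csv
import io
from statistics import median


def flag_anomalies(csv_text: str, column: str, factor: float = 2.0) -> list[str]:
    """Return the cycles whose value in `column` exceeds factor x the median.

    Assumes a header row containing a `cycle` column plus the named usage column.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    values = [float(row[column]) for row in rows]
    if not values:
        return []
    baseline = median(values)
    return [row["cycle"] for row, value in zip(rows, values) if value > factor * baseline]


if __name__ == "__main__":
    # Made-up numbers, purely to exercise the function.
    sample = "cycle,ai_tokens\n2025-01,100\n2025-02,110\n2025-03,400\n2025-04,90\n"
    print(flag_anomalies(sample, "ai_tokens"))
```

The median is used as the baseline rather than the mean so that a single spike does not raise the bar it is measured against.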

Reducing usage

Practical levers, in order of impact:

Skill cache — cache deterministic skill outputs to skip redundant LLM calls.
Model choice — small models for routine triage, large for hard problems.
Schedule frequency — cron-style daily beats every-5-minutes for status checks.
Trim agent context — short, well-scoped chats use fewer tokens than open-ended sessions.
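
The idea behind the first lever, caching deterministic outputs, can be sketched as a small memoization layer. This is not the product's Skill cache. The `SkillCache` class, its key scheme, and the `call_model` callback are hypothetical, and the approach is only safe when the same inputs always produce the same output.

```python
import hashlib
import json
from typing import Callable


class SkillCache:
    """In-memory cache keyed by skill name plus canonicalized inputs."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(skill: str, inputs: dict) -> str:
        # sort_keys makes the key independent of dict insertion order.
        canonical = json.dumps({"skill": skill, "inputs": inputs}, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def run(self, skill: str, inputs: dict, call_model: Callable[[str, dict], str]) -> str:
        """Return the cached output if present; otherwise call the model once and store it."""
        key = self._key(skill, inputs)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = call_model(skill, inputs)
        self._store[key] = result
        return result


if __name__ == "__main__":
    def fake_model(skill: str, inputs: dict) -> str:
        # Stand-in for a real LLM call.
        return f"{skill}:{sorted(inputs.items())}"

    cache = SkillCache()
    cache.run("disk-check", {"host": "a", "path": "/"}, fake_model)
    cache.run("disk-check", {"path": "/", "host": "a"}, fake_model)  # same inputs, different order
    print(cache.hits, cache.misses)
```

Every cache hit is a model call, and its tokens, that never reaches the gateway.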

Where this data lives

Aggregates come from two Django apps: the activity app (command counts) and the profiles app (billing aggregates).

Subscriptions — change plan or add seats.
Payments — top up the wallet.
Workspace settings — webhook integrations.