Usage & quotas
Usage reports show what your workspace consumed this cycle and let you wire alerts before you hit a hard limit.
Usage dashboard
Current-cycle usage is broken down by:
- Machine — commands run, AI tokens spent in chat sessions targeting that machine.
- Member — chat tokens, schedule trigger count, command count.
- Skill — invocations and tokens.
- AI provider — Anthropic, OpenAI, Google, etc., with per-model breakdown.
Quota types
Each plan caps multiple resources:
- Machine count — registered machines, regardless of online status.
- AI tokens — input + output across all chat surfaces.
- Storage — session transcripts, audit log, file index.
- Schedule runs — number of triggered runs per cycle.
- API requests — REST and gRPC, per minute and per cycle.
Quota alerts
Alerts fire at 80 %, 95 %, and 100 % of every quota and are delivered through:
- Email to all Owners.
- Slack to the channel configured in Workspace settings → Integrations.
- Webhook with a JSON payload ({quota, threshold, current, cycle_ends_at}).
You can mute thresholds per quota type — useful for low-priority limits like API requests.
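As a rough illustration of consuming the webhook payload described above, the sketch below parses the four documented fields and maps the threshold to a severity label. The field types are assumptions, not documented behavior: `threshold` is treated as a percentage number, `current` as a numeric usage value, and `cycle_ends_at` as an ISO 8601 timestamp. The names `QuotaAlert`, `parse_quota_alert`, and `severity`, and the sample quota name `ai_tokens`, are hypothetical.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass
class QuotaAlert:
    quota: str               # quota identifier, e.g. "ai_tokens" (hypothetical value)
    threshold: float         # assumed to be a percentage: 80, 95, or 100
    current: float           # assumed to be the current usage figure
    cycle_ends_at: datetime  # assumed to arrive as an ISO 8601 string


def parse_quota_alert(body: str) -> QuotaAlert:
    """Parse the raw webhook body into a typed alert; raises on missing fields."""
    data = json.loads(body)
    return QuotaAlert(
        quota=str(data["quota"]),
        threshold=float(data["threshold"]),
        current=float(data["current"]),
        # A trailing "Z" is only accepted by fromisoformat on Python 3.11+.
        cycle_ends_at=datetime.fromisoformat(data["cycle_ends_at"]),
    )


def severity(alert: QuotaAlert) -> str:
    """Map the documented 80 / 95 / 100 % thresholds to a label for routing."""
    if alert.threshold >= 100:
        return "critical"
    if alert.threshold >= 95:
        return "warning"
    return "info"
```

A receiver could call `parse_quota_alert(request_body)` and route on `severity(...)`, for example paging only on `critical`. Check the real payload against these type assumptions before relying on them.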
Hard vs soft limits
| Limit | What happens at 100 % |
|---|---|
| Machine count | Hard — new cmdop connect registrations rejected. Existing agents keep working. |
| AI tokens | Hard — gateway returns quota error to chat / agent calls. |
| Storage | Soft — older transcripts pruned per retention policy. |
| Schedule runs | Soft — runs queue with a warning; resume next cycle. |
| API requests | Soft (per-minute) / Hard (per-cycle). |
Hitting the machine quota stops new registrations but does not stop running agents. Hitting the AI quota returns gateway errors to chat / agent calls until you top up or upgrade.
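The table above can be encoded as a small lookup if you want your own tooling to tell blocking limits from degrading ones. This is only a sketch: the dictionary keys are hypothetical identifiers chosen for illustration, not names the product is known to use.

```python
from enum import Enum


class Enforcement(Enum):
    HARD = "hard"  # requests are rejected at 100 %
    SOFT = "soft"  # service degrades (pruning, queueing, throttling) at 100 %


# Hypothetical keys; the enforcement values mirror the table above.
LIMITS = {
    "machine_count": Enforcement.HARD,
    "ai_tokens": Enforcement.HARD,
    "storage": Enforcement.SOFT,
    "schedule_runs": Enforcement.SOFT,
    "api_requests_per_minute": Enforcement.SOFT,
    "api_requests_per_cycle": Enforcement.HARD,
}


def is_blocking(quota: str, used: float, cap: float) -> bool:
    """Return True when usage has reached the cap on a hard limit."""
    return used >= cap and LIMITS[quota] is Enforcement.HARD
```

For example, `is_blocking("ai_tokens", 1_000_000, 1_000_000)` is true, while the same usage ratio on `storage` is not, because storage is a soft limit.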
Historical usage
Browse the last 12 cycles and export per-cycle breakdowns as CSV. Useful for finance reconciliation and spotting anomalies.
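One way to spot anomalies in an exported CSV is to compare each cycle against the median of all cycles. The sketch below assumes the export has a `cycle` column and a numeric column such as `ai_tokens`; both column names are assumptions, since the export schema is not documented here, and `find_anomalies` is a hypothetical helper.

```python
import csv
import io
from statistics import median
from typing import List


def find_anomalies(csv_text: str, column: str, factor: float = 2.0) -> List[str]:
    """Return the cycles whose value in `column` exceeds factor x the median."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    values = [float(row[column]) for row in rows]
    if not values:
        return []
    baseline = median(values)
    if baseline <= 0:
        return []  # no meaningful baseline to compare against
    return [
        row["cycle"]  # assumed column naming each billing cycle
        for row, value in zip(rows, values)
        if value > factor * baseline
    ]
```

The median is used instead of the mean so that a single spike does not inflate the baseline it is compared against.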
Reducing usage
Practical levers, in order of impact:
- Skill cache — cache deterministic skill outputs to skip redundant LLM calls.
- Model choice — small models for routine triage, large for hard problems.
- Schedule frequency — cron-style daily beats every-5-minutes for status checks.
- Trim agent context — short, well-scoped chats use fewer tokens than open-ended sessions.
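The skill cache lever amounts to memoizing deterministic outputs. The sketch below is a generic in-memory version, not the product's implementation: `SkillCache` and the `call_model` callback are hypothetical, and a real cache would likely need expiry and persistent storage.

```python
import hashlib
import json
from typing import Callable, Dict


class SkillCache:
    """In-memory cache keyed by skill name plus canonicalized parameters."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(skill: str, params: dict) -> str:
        # sort_keys makes the key independent of dict insertion order.
        canonical = json.dumps({"skill": skill, "params": params}, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def run(self, skill: str, params: dict,
            call_model: Callable[[str, dict], str]) -> str:
        """Return a cached result if present; otherwise call the model once."""
        key = self._key(skill, params)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = call_model(skill, params)  # the only place tokens are spent
        self._store[key] = result
        return result
```

Only cache skills whose output depends solely on their inputs; anything that reads live machine state would return stale results.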
Where this data lives
Aggregates come from the Django activity app (command counts) and profiles app (billing aggregates).
Related
- Subscriptions — change plan or add seats.
- Payments — top up the wallet.
- Workspace settings — webhook integrations.