
Usage & quotas

Usage reports show what your workspace consumed this cycle and let you wire alerts before you hit a hard limit.

Usage dashboard

Current-cycle usage is broken down by:

  • Machine — commands run, AI tokens spent in chat sessions targeting that machine.
  • Member — chat tokens, schedule trigger count, command count.
  • Skill — invocations and tokens.
  • AI provider — Anthropic, OpenAI, Google, etc., with per-model breakdown.

Quota types

Each plan caps multiple resources:

  • Machine count — registered machines, regardless of online status.
  • AI tokens — input + output across all chat surfaces.
  • Storage — session transcripts, audit log, file index.
  • Schedule runs — number of triggered runs per cycle.
  • API requests — REST and gRPC, per minute and per cycle.

Quota alerts

Alerts fire at 80 %, 95 %, and 100 % of every quota and are delivered through these channels:

  • Email to all Owners.
  • Slack to the channel configured in Workspace settings → Integrations.
  • Webhook with a JSON payload ({quota, threshold, current, cycle_ends_at}).

You can mute thresholds per quota type — useful for low-priority limits like API requests.
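
A webhook receiver could decode the payload along these lines. This is a minimal sketch, not the product's client code: only the four key names come from the payload shape above. The field types (threshold as a percentage, `cycle_ends_at` as an ISO 8601 timestamp), the `QuotaAlert` class, and the severity mapping are assumptions made for illustration.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass
class QuotaAlert:
    quota: str               # quota identifier (value format is an assumption)
    threshold: int           # assumed to be a percentage: 80, 95, or 100
    current: float           # assumed to be usage in the quota's own unit
    cycle_ends_at: datetime  # assumed to arrive as an ISO 8601 string


def parse_alert(body: str) -> QuotaAlert:
    """Decode a webhook body that carries the four documented keys."""
    data = json.loads(body)
    return QuotaAlert(
        quota=data["quota"],
        threshold=int(data["threshold"]),
        current=float(data["current"]),
        cycle_ends_at=datetime.fromisoformat(data["cycle_ends_at"]),
    )


def severity(alert: QuotaAlert) -> str:
    """Map the documented thresholds to a hypothetical on-call severity."""
    if alert.threshold >= 100:
        return "page"
    if alert.threshold >= 95:
        return "warn"
    return "info"
```

Routing on `threshold` rather than on `current` keeps the receiver independent of each quota's unit, which differs between tokens, runs, and storage.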

Hard vs soft limits

Limit         | What happens at 100 %
Machine count | Hard: new cmdop connect registrations are rejected. Existing agents keep working.
AI tokens     | Hard: the gateway returns a quota error to chat / agent calls.
Storage       | Soft: older transcripts are pruned per retention policy.
Schedule runs | Soft: runs queue with a warning and resume next cycle.
API requests  | Soft (per-minute) / Hard (per-cycle).

Hitting the machine count quota stops new registrations but does not stop running agents. Hitting the AI token quota returns gateway errors to chat / agent calls until you top up or upgrade.
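
For client-side planning, the hard/soft split can be written down as a small lookup. The sketch below only restates the table above; the quota keys (`machine_count`, `ai_tokens`, and so on) and the `blocks_new_work` helper are hypothetical names, not identifiers from the product.

```python
from enum import Enum


class Limit(Enum):
    HARD = "hard"
    SOFT = "soft"


# Keys are hypothetical identifiers; the hard/soft values mirror the table above.
POLICY = {
    "machine_count": Limit.HARD,
    "ai_tokens": Limit.HARD,
    "storage": Limit.SOFT,
    "schedule_runs": Limit.SOFT,
    "api_requests_per_minute": Limit.SOFT,
    "api_requests_per_cycle": Limit.HARD,
}


def blocks_new_work(quota: str, used: float, limit: float) -> bool:
    """Return True when the quota is hard and usage has reached 100 % of the limit."""
    return POLICY[quota] is Limit.HARD and used >= limit
```

A caller might check `blocks_new_work("ai_tokens", used, limit)` before starting a long agent run, so that a quota error surfaces up front rather than partway through.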

Historical usage

Browse the last 12 cycles and export per-cycle breakdowns as CSV. Useful for finance reconciliation and spotting anomalies.
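
One way to spot anomalies in an export is to compare each cycle against the median. This is a sketch under stated assumptions: the CSV column names `cycle` and `ai_tokens` are hypothetical, since the export schema is not described here, and the 1.5x factor is an arbitrary starting point.

```python
import csv
import io
from statistics import median


def flag_anomalies(csv_text: str, factor: float = 1.5) -> list[str]:
    """Return the cycles whose usage exceeds `factor` times the median usage."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Column names are assumptions; adjust them to match the actual export.
    usage = {row["cycle"]: float(row["ai_tokens"]) for row in rows}
    baseline = median(usage.values())
    return [cycle for cycle, used in usage.items() if used > factor * baseline]
```

The median is used instead of the mean so that a single runaway cycle does not raise the baseline it is being compared against.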

Reducing usage

Practical levers, in order of impact:

  • Skill cache — cache deterministic skill outputs to skip redundant LLM calls.
  • Model choice — small models for routine triage, large for hard problems.
  • Schedule frequency — cron-style daily beats every-5-minutes for status checks.
  • Trim agent context — short, well-scoped chats use fewer tokens than open-ended sessions.
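
The skill cache idea can be illustrated with plain memoization keyed on the skill name and its parameters. This is a generic sketch, not the product's cache: `run_cached`, `cache_key`, and the `call_model` callback are hypothetical names, and the approach is only safe for skills whose output is fully determined by their inputs.

```python
import hashlib
import json
from typing import Callable

# In-memory store for illustration; a real deployment would likely persist it.
_cache: dict[str, str] = {}


def cache_key(skill: str, params: dict) -> str:
    """Build a stable key: sort_keys makes the hash independent of dict order."""
    payload = json.dumps({"skill": skill, "params": params}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_cached(skill: str, params: dict,
               call_model: Callable[[str, dict], str]) -> str:
    """Call the model only on a cache miss and reuse the stored output otherwise."""
    key = cache_key(skill, params)
    if key not in _cache:
        _cache[key] = call_model(skill, params)
    return _cache[key]
```

Repeated invocations with identical inputs then cost one LLM call instead of one per invocation.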

Where this data lives

Aggregates come from the Django activity app (command counts) and the profiles app (billing aggregates).
