
Agent Loop

Every CMDOP chat — local or remote — runs the same loop. Each turn resolves a tool catalogue, sanitises history, calls the LLM, executes tool calls, and decides whether to keep going. This page describes the cycle so you can debug it when something looks off.

The cycle in one diagram

The loop lives in internal/agent/runner/loop.go:61–300. The framework is the typed generic TypedRunner[D, O] in internal/agent/runner/runner.go:1–120 where D is the dependency type and O is the output type.

Pseudocode

    for turn = 1..maxTurns {
        hidden := platform.HiddenTools()
        tools := registry.Active()
        if turn == 1 && allowedToolTags != nil {
            tools = filterByTags(tools, allowedToolTags)
        }
        tools = subtract(tools, hidden)

        msgs := sanitize(history)              // drop orphan tool_calls
        req := buildRequest(model, msgs, tools, cacheBreakpoints, temp=0.3)
        resp := callLLM(req, retries=6, backoff=500ms..32s)
        history = append(history, resp.Message)

        if len(resp.ToolCalls) == 0 {
            return resp.Message                // final answer
        }

        results := execTools(resp.ToolCalls, parallel=config.ParallelTools)
        history = append(history, results...)
    }

What happens on each turn

1. Inject system context

The runner injects fresh system reminders each turn (not new messages, so history stays clean):

  • SectionIdentity — "I'm CMDOP Operator on <hostname>."
  • SectionEnvInfo — local OS, timezone, user (suppressed when targeting a remote machine).
  • SectionTargetMachine — "your client is asking from vps-foo, use ask_agent" (only for CLI Path B, cmdop chat --machine).
  • SectionCriticalTools — prefer tools for work (safety net).
  • SectionNoToolsHonesty — refuse to fabricate output when no tools are available.
  • SectionInspectorLinks — prefer cmdop:// hrefs in file references.
  • SectionPlatformContext — platform-specific hints (browser, mobile, vps).

The static prefix is split on prompts.CacheBoundaryMarker so Anthropic prompt caching applies to everything before the marker.
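A minimal sketch of that split, assuming the marker is a plain substring (the local constant below stands in for prompts.CacheBoundaryMarker; the real value may differ):

```go
package main

import (
	"fmt"
	"strings"
)

// cacheBoundaryMarker stands in for prompts.CacheBoundaryMarker; the real
// constant lives in the prompts package and may use a different token.
const cacheBoundaryMarker = "<!--cache-boundary-->"

// splitSystemPrompt returns the static (cacheable) prefix and the dynamic
// suffix of a system prompt. Everything before the marker is eligible for
// Anthropic prompt caching.
func splitSystemPrompt(prompt string) (static, dynamic string) {
	if i := strings.Index(prompt, cacheBoundaryMarker); i >= 0 {
		return prompt[:i], prompt[i+len(cacheBoundaryMarker):]
	}
	return prompt, "" // no marker: treat the whole prompt as static
}

func main() {
	static, dynamic := splitSystemPrompt("identity and tools<!--cache-boundary-->today is Tuesday")
	fmt.Printf("static=%q dynamic=%q\n", static, dynamic)
}
```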

2. Resolve tools

Three filters run in order:

  1. Intent router (default OFF). Re-enable with CMDOP_INTENT_ROUTER=on. When ON, a cheap classifier decides DIRECT vs AGENT. DIRECT skips tool defs (~2000 tokens saved). See internal/agent/router/CLAUDE.md for why it is off.
  2. Tool-category router (turn 1 only). When RunContext.AllowedToolTags is set, the catalogue is restricted to tools whose Config.Tags intersect the allowlist. Core tools are always included. Cleared after turn 1.
  3. Hide map. HiddenTools is consulted last so platform / mode rules can drop individual tools even if the LLM hallucinates a call later.
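The filter order can be sketched as below. This is an illustration, not the real types: the Tool fields and function signature are assumptions, and the intent router is omitted since it is off by default.

```go
package main

import "fmt"

// Tool is a simplified stand-in for the registry's tool type.
type Tool struct {
	Name string
	Tags []string
	Core bool // core tools bypass the tag allowlist
}

// resolveTools applies the tag allowlist (turn 1 only), then the hide map.
// The hide map runs last so platform / mode rules always win.
func resolveTools(active []Tool, turn int, allowedTags, hidden map[string]bool) []Tool {
	var out []Tool
	for _, t := range active {
		if turn == 1 && allowedTags != nil && !t.Core && !hasAnyTag(t, allowedTags) {
			continue // tag router: drop non-core tools outside the allowlist
		}
		if hidden[t.Name] {
			continue // hide map always wins, even over allowed tags
		}
		out = append(out, t)
	}
	return out
}

func hasAnyTag(t Tool, allowed map[string]bool) bool {
	for _, tag := range t.Tags {
		if allowed[tag] {
			return true
		}
	}
	return false
}

func main() {
	tools := []Tool{
		{Name: "read_file", Core: true},
		{Name: "browser_open", Tags: []string{"browser"}},
		{Name: "run_shell", Tags: []string{"shell"}},
	}
	// Only the core tool survives: browser_open fails the tag filter,
	// run_shell is in the hide map.
	got := resolveTools(tools, 1, map[string]bool{"shell": true}, map[string]bool{"run_shell": true})
	fmt.Println(len(got), got[0].Name)
}
```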

3. Sanitize history

The runner drops orphan tool calls — entries where the assistant requested a tool but no matching result followed. This protects providers that reject mismatched call/result pairs.
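A sketch of that pass, under a deliberately simplified message shape (the real history type is richer):

```go
package main

import "fmt"

// Message is a simplification of the real history entry.
type Message struct {
	Role       string
	ToolCalls  []string // IDs of tool calls the assistant requested
	ToolCallID string   // set on role "tool" result messages
}

// sanitize removes assistant tool calls that never received a matching
// result, so providers that validate call/result pairing accept the history.
func sanitize(history []Message) []Message {
	answered := map[string]bool{}
	for _, m := range history {
		if m.Role == "tool" {
			answered[m.ToolCallID] = true
		}
	}
	out := make([]Message, 0, len(history))
	for _, m := range history {
		if m.Role == "assistant" {
			var kept []string
			for _, id := range m.ToolCalls {
				if answered[id] {
					kept = append(kept, id) // keep only answered calls
				}
			}
			m.ToolCalls = kept
		}
		out = append(out, m)
	}
	return out
}

func main() {
	history := []Message{
		{Role: "assistant", ToolCalls: []string{"a", "b"}}, // "b" is an orphan
		{Role: "tool", ToolCallID: "a"},
	}
	clean := sanitize(history)
	fmt.Println(clean[0].ToolCalls) // only the answered call survives
}
```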

4. Build the LLM request

    ChatRequest{
        Model:            agent.Model,        // alias or concrete model
        Messages:         convert(history),
        Tools:            convert(tools),     // omit if no tools needed
        ToolChoice:       "auto",
        Temperature:      0.3,                // fixed
        MaxTokens:        budget,
        CacheBreakpoints: []int{0, ...history-based...},
    }

5. Call the LLM with retry

internal/agent/llm/retry.go wraps the provider with 6 attempts and exponential backoff from 500 ms up to 32 s. Streaming is optional via StreamingProvider; when a stream dies near the deadline, the runner falls back inline to non-streaming. Streaming idle timeout is 2 s per token batch.

6. Execute tool calls

If config.ParallelTools is true, tool calls run concurrently — handy for ask_agents or multiple read_file calls. Otherwise they execute sequentially.

Each tool result is appended to history with the matching tool_call_id so the next turn sees a clean conversation.
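A sketch of the parallel/sequential split. The types and signature are illustrative; the point is that results land at the same index as their call, so tool_call_id pairing stays deterministic regardless of completion order:

```go
package main

import (
	"fmt"
	"sync"
)

type ToolCall struct{ ID, Name string }
type ToolResult struct{ CallID, Output string }

// execTools runs calls sequentially or concurrently. Results are written
// into a pre-sized slice so their order always matches the calls.
func execTools(calls []ToolCall, parallel bool, run func(ToolCall) ToolResult) []ToolResult {
	results := make([]ToolResult, len(calls))
	if !parallel {
		for i, c := range calls {
			results[i] = run(c)
		}
		return results
	}
	var wg sync.WaitGroup
	for i, c := range calls {
		wg.Add(1)
		go func(i int, c ToolCall) {
			defer wg.Done()
			results[i] = run(c) // each goroutine owns one slot: no race
		}(i, c)
	}
	wg.Wait()
	return results
}

func main() {
	calls := []ToolCall{{ID: "1", Name: "read_file"}, {ID: "2", Name: "read_file"}}
	out := execTools(calls, true, func(c ToolCall) ToolResult {
		return ToolResult{CallID: c.ID, Output: "ok"}
	})
	fmt.Println(out[0].CallID, out[1].CallID) // order matches input
}
```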

Termination conditions

The loop ends when any of these is true:

  • The LLM response has no tool calls (final answer).
  • maxTurns is reached. Configurable per agent.
  • The parent context is cancelled (Go context.Context).
  • The UI cancel checker fires (CancelChecker interface — bypasses context deadline pressure so a user can kill a long run instantly).

Cancellation has two channels

  • Context. Standard Go cancellation; respects deadlines.
  • CancelChecker. UI-driven (Desktop or TUI button). Honoured immediately even when the context still has time left.

Both are checked at every loop boundary.

Streaming events

When streaming is enabled, the runner pushes events onto RunContext.Stream (a *StreamBus). Consumers see:

  • TOKEN — next text fragment.
  • TOOL_START / TOOL_END — tool execution markers.
  • THINKING — provider thinking marker (when supported).
  • ERROR / CANCELLED — terminal events.

Same event vocabulary appears on ask_agent_stream — see Agent Communication.
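A consumer of that vocabulary might look like the sketch below. The event type and channel shape are assumptions; only the event names come from the list above:

```go
package main

import "fmt"

// EventKind mirrors the stream vocabulary listed above; the concrete
// types on StreamBus may differ.
type EventKind string

const (
	EvToken     EventKind = "TOKEN"
	EvToolStart EventKind = "TOOL_START"
	EvToolEnd   EventKind = "TOOL_END"
	EvThinking  EventKind = "THINKING"
	EvError     EventKind = "ERROR"
	EvCancelled EventKind = "CANCELLED"
)

type Event struct {
	Kind EventKind
	Text string
}

// consume drains a stream, accumulating text fragments and stopping early
// on a terminal event (ERROR / CANCELLED).
func consume(events <-chan Event) (text string, terminal EventKind) {
	for ev := range events {
		switch ev.Kind {
		case EvToken:
			text += ev.Text
		case EvError, EvCancelled:
			return text, ev.Kind
		}
	}
	return text, ""
}

func main() {
	ch := make(chan Event, 3)
	ch <- Event{Kind: EvToken, Text: "hello"}
	ch <- Event{Kind: EvToken, Text: " world"}
	ch <- Event{Kind: EvCancelled}
	close(ch)
	text, term := consume(ch)
	fmt.Println(text, term)
}
```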

Cache boundaries

The runner sets cache breakpoints at index 0 (start of the static system prefix) plus history-based positions. Anthropic providers honour these as cache_control: ephemeral, which dramatically reduces token cost on long sessions.

What can derail the loop

Symptom                              Likely cause
Agent answers without using tools    Intent router is on and the classifier ruled DIRECT — disable with CMDOP_INTENT_ROUTER=off
Specific tool never gets called      Tool listed in HiddenTools for this platform / mode
Loop hits max turns                  Tool keeps reporting partial work; check tool implementations and truncated flags
LLM errors → fallback to text        Streaming died near the deadline; expected, and retried

The runner is the only place that calls the LLM. If you see provider-specific behaviour inside a tool, that tool is ignoring the framework — file an issue.

TAGS: agent-loop, runner, llm, retries, streaming DEPENDS_ON: [agents, tools, memory]
