
Agent Loop

Every CMDOP chat — local or remote — runs the same loop. Each turn resolves a tool catalogue, sanitises history, calls the LLM, executes tool calls, and decides whether to keep going. This page describes the cycle so you can debug it when something looks off.

The cycle in one diagram

The loop lives in internal/agent/runner/loop.go:61–300. The framework is the typed generic TypedRunner[D, O] in internal/agent/runner/runner.go:1–120 where D is the dependency type and O is the output type.

Pseudocode

    for turn = 1..maxTurns {
        hidden := platform.HiddenTools()
        tools := registry.Active()
        if turn == 1 && allowedToolTags != nil {
            tools = filterByTags(tools, allowedToolTags)
        }
        tools = subtract(tools, hidden)

        msgs := sanitize(history)              // drop orphan tool_calls
        req := buildRequest(model, msgs, tools, cacheBreakpoints, temp=0.3)
        resp := callLLM(req, retries=6, backoff=500ms..32s)
        history = append(history, resp.Message)

        if len(resp.ToolCalls) == 0 {
            return resp.Message                // final answer
        }

        results := execTools(resp.ToolCalls, parallel=config.ParallelTools)
        history = append(history, results...)
    }

What happens on each turn

1. Inject system context

The runner injects fresh system reminders each turn (not new messages, so history stays clean):

  • SectionIdentity — "I'm CMDOP Operator on <hostname>."
  • SectionEnvInfo — local OS, timezone, user (suppressed when targeting a remote machine).
  • SectionTargetMachine — "your client is asking from vps-foo, use ask_agent" (only for CLI Path B, cmdop chat --machine).
  • SectionCriticalTools — prefer tools for work (safety net).
  • SectionNoToolsHonesty — refuse to fabricate output when no tools are available.
  • SectionInspectorLinks — prefer cmdop:// hrefs in file references.
  • SectionPlatformContext — platform-specific hints (browser, mobile, vps).

The static prefix is split on prompts.CacheBoundaryMarker so Anthropic prompt caching applies to everything before the marker.
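A minimal sketch of that split, assuming the marker is a plain substring (the local constant below stands in for prompts.CacheBoundaryMarker; the real value may differ):

```go
package main

import (
	"fmt"
	"strings"
)

// cacheBoundaryMarker stands in for prompts.CacheBoundaryMarker; the real
// constant lives in the prompts package and may use a different token.
const cacheBoundaryMarker = "<!--cache-boundary-->"

// splitSystemPrompt returns the static (cacheable) prefix and the dynamic
// suffix of a system prompt. Everything before the marker is eligible for
// Anthropic prompt caching.
func splitSystemPrompt(prompt string) (static, dynamic string) {
	if i := strings.Index(prompt, cacheBoundaryMarker); i >= 0 {
		return prompt[:i], prompt[i+len(cacheBoundaryMarker):]
	}
	return prompt, "" // no marker: treat the whole prompt as static
}

func main() {
	static, dynamic := splitSystemPrompt("identity and tools<!--cache-boundary-->today is Tuesday")
	fmt.Printf("static=%q dynamic=%q\n", static, dynamic)
}
```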

2. Resolve tools

Three filters run in order:

  1. Intent router (default OFF). Re-enable with CMDOP_INTENT_ROUTER=on. When ON, a cheap classifier decides DIRECT vs AGENT. DIRECT skips tool defs (~2000 tokens saved). See internal/agent/router/CLAUDE.md for why it is off.
  2. Tool-category router (turn 1 only). When RunContext.AllowedToolTags is set, the catalogue is restricted to tools whose Config.Tags intersect the allowlist. Core tools are always included. Cleared after turn 1.
  3. Hide map. HiddenTools is consulted last so platform / mode rules can drop individual tools even if the LLM hallucinates a call later.
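The filter order can be sketched as below. This is an illustration, not the real types: the Tool fields and function signature are assumptions, and the intent router is omitted since it is off by default.

```go
package main

import "fmt"

// Tool is a simplified stand-in for the registry's tool type.
type Tool struct {
	Name string
	Tags []string
	Core bool // core tools bypass the tag allowlist
}

// resolveTools applies the tag allowlist (turn 1 only), then the hide map.
// The hide map runs last so platform / mode rules always win.
func resolveTools(active []Tool, turn int, allowedTags, hidden map[string]bool) []Tool {
	var out []Tool
	for _, t := range active {
		if turn == 1 && allowedTags != nil && !t.Core && !hasAnyTag(t, allowedTags) {
			continue // tag router: drop non-core tools outside the allowlist
		}
		if hidden[t.Name] {
			continue // hide map always wins, even over allowed tags
		}
		out = append(out, t)
	}
	return out
}

func hasAnyTag(t Tool, allowed map[string]bool) bool {
	for _, tag := range t.Tags {
		if allowed[tag] {
			return true
		}
	}
	return false
}

func main() {
	tools := []Tool{
		{Name: "read_file", Core: true},
		{Name: "browser_open", Tags: []string{"browser"}},
		{Name: "run_shell", Tags: []string{"shell"}},
	}
	// Only the core tool survives: browser_open fails the tag filter,
	// run_shell is in the hide map.
	got := resolveTools(tools, 1, map[string]bool{"shell": true}, map[string]bool{"run_shell": true})
	fmt.Println(len(got), got[0].Name)
}
```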

3. Sanitize history

The runner drops orphan tool calls — entries where the assistant requested a tool but no matching result followed. This protects providers that reject mismatched call/result pairs.
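A sketch of that pass, under a deliberately simplified message shape (the real history type is richer):

```go
package main

import "fmt"

// Message is a simplification of the real history entry.
type Message struct {
	Role       string
	ToolCalls  []string // IDs of tool calls the assistant requested
	ToolCallID string   // set on role "tool" result messages
}

// sanitize removes assistant tool calls that never received a matching
// result, so providers that validate call/result pairing accept the history.
func sanitize(history []Message) []Message {
	answered := map[string]bool{}
	for _, m := range history {
		if m.Role == "tool" {
			answered[m.ToolCallID] = true
		}
	}
	out := make([]Message, 0, len(history))
	for _, m := range history {
		if m.Role == "assistant" {
			var kept []string
			for _, id := range m.ToolCalls {
				if answered[id] {
					kept = append(kept, id) // keep only answered calls
				}
			}
			m.ToolCalls = kept
		}
		out = append(out, m)
	}
	return out
}

func main() {
	history := []Message{
		{Role: "assistant", ToolCalls: []string{"a", "b"}}, // "b" is an orphan
		{Role: "tool", ToolCallID: "a"},
	}
	clean := sanitize(history)
	fmt.Println(clean[0].ToolCalls) // only the answered call survives
}
```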

4. Build the LLM request

    ChatRequest{
        Model:            agent.Model,        // alias or concrete model
        Messages:         convert(history),
        Tools:            convert(tools),     // omit if no tools needed
        ToolChoice:       "auto",
        Temperature:      0.3,                // fixed
        MaxTokens:        budget,
        CacheBreakpoints: []int{0, ...history-based...},
    }

5. Call the LLM with retry

internal/agent/llm/retry.go wraps the provider with 6 attempts and exponential backoff from 500 ms up to 32 s. Streaming is optional via StreamingProvider; when a stream dies near the deadline, the runner falls back inline to non-streaming. Streaming idle timeout is 2 s per token batch.

6. Execute tool calls

If config.ParallelTools is true, tool calls run concurrently — handy for ask_agents or multiple read_file calls. Otherwise they execute sequentially.

Each tool result is appended to history with the matching tool_call_id so the next turn sees a clean conversation.
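A sketch of the parallel/sequential split. The types and signature are illustrative; the point is that results land at the same index as their call, so tool_call_id pairing stays deterministic regardless of completion order:

```go
package main

import (
	"fmt"
	"sync"
)

type ToolCall struct{ ID, Name string }
type ToolResult struct{ CallID, Output string }

// execTools runs calls sequentially or concurrently. Results are written
// into a pre-sized slice so their order always matches the calls.
func execTools(calls []ToolCall, parallel bool, run func(ToolCall) ToolResult) []ToolResult {
	results := make([]ToolResult, len(calls))
	if !parallel {
		for i, c := range calls {
			results[i] = run(c)
		}
		return results
	}
	var wg sync.WaitGroup
	for i, c := range calls {
		wg.Add(1)
		go func(i int, c ToolCall) {
			defer wg.Done()
			results[i] = run(c) // each goroutine owns one slot: no race
		}(i, c)
	}
	wg.Wait()
	return results
}

func main() {
	calls := []ToolCall{{ID: "1", Name: "read_file"}, {ID: "2", Name: "read_file"}}
	out := execTools(calls, true, func(c ToolCall) ToolResult {
		return ToolResult{CallID: c.ID, Output: "ok"}
	})
	fmt.Println(out[0].CallID, out[1].CallID) // order matches input
}
```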

Termination conditions

The loop ends when any of these is true:

  • The LLM response has no tool calls (final answer).
  • maxTurns is reached. Configurable per agent.
  • The parent context is cancelled (Go context.Context).
  • The UI cancel checker fires (CancelChecker interface — bypasses context deadline pressure so a user can kill a long run instantly).

Cancellation has two channels

  • Context. Standard Go cancellation; respects deadlines.
  • CancelChecker. UI-driven (Desktop or TUI button). Honoured immediately even when the context still has time left.

Both are checked at every loop boundary.

Streaming events

When streaming is enabled, the runner pushes events onto RunContext.Stream (a *StreamBus). Consumers see:

  • TOKEN — next text fragment.
  • TOOL_START / TOOL_END — tool execution markers.
  • THINKING — provider thinking marker (when supported).
  • ERROR / CANCELLED — terminal events.

Same event vocabulary appears on ask_agent_stream — see Agent Communication.
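A consumer of that vocabulary might look like the sketch below. The event type and channel shape are assumptions; only the event names come from the list above:

```go
package main

import "fmt"

// EventKind mirrors the stream vocabulary listed above; the concrete
// types on StreamBus may differ.
type EventKind string

const (
	EvToken     EventKind = "TOKEN"
	EvToolStart EventKind = "TOOL_START"
	EvToolEnd   EventKind = "TOOL_END"
	EvThinking  EventKind = "THINKING"
	EvError     EventKind = "ERROR"
	EvCancelled EventKind = "CANCELLED"
)

type Event struct {
	Kind EventKind
	Text string
}

// consume drains a stream, accumulating text fragments and stopping early
// on a terminal event (ERROR / CANCELLED).
func consume(events <-chan Event) (text string, terminal EventKind) {
	for ev := range events {
		switch ev.Kind {
		case EvToken:
			text += ev.Text
		case EvError, EvCancelled:
			return text, ev.Kind
		}
	}
	return text, ""
}

func main() {
	ch := make(chan Event, 3)
	ch <- Event{Kind: EvToken, Text: "hello"}
	ch <- Event{Kind: EvToken, Text: " world"}
	ch <- Event{Kind: EvCancelled}
	close(ch)
	text, term := consume(ch)
	fmt.Println(text, term)
}
```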

Cache boundaries

The runner sets cache breakpoints at index 0 (start of the static system prefix) plus history-based positions. Anthropic providers honour these as cache_control: ephemeral, which dramatically reduces token cost on long sessions.

What can derail the loop

Symptom                              Likely cause
Agent answers without using tools    Intent router is on and the classifier ruled DIRECT — disable with CMDOP_INTENT_ROUTER=off
Specific tool never gets called      Tool listed in HiddenTools for this platform / mode
Loop hits max turns                  Tool keeps reporting partial work; check tool implementations and truncated flags
LLM errors → fallback to text        Streaming died near the deadline; expected, and retried

The runner is the only place that calls the LLM. If you see provider-specific behaviour inside a tool, that tool is ignoring the framework — file an issue.

TAGS: agent-loop, runner, llm, retries, streaming DEPENDS_ON: [agents, tools, memory]
