Server-to-Server

Server-to-server is what we call the pattern where a CMDOP agent on one machine asks the agent on another machine to do something. Not “run a shell command” — for that there is cmdop connect exec — but “ask the remote LLM to look at this and reply”. It is the feature that lets your laptop’s agent ask the prod-1 agent to inspect logs, then ask db-1 to validate the schema, all from a single chat turn.

The single funnel

Every cross-machine agent call goes through one piece of code: internal/connect/remoteagent/client.go:79–101. The funnel exposes two methods:

Ask(ctx, opts) (*AskResult, error) — one-shot, full reply in one struct.
AskStream(ctx, opts, handler) (*AskStreamResult, error) — token stream, handler invoked per event.

Internally, both run the same sequence: workspace resolve → fuzzy machine resolve → online precondition → dial → SetMachine → AgentService.Run (or Stream). The timeout is clamped to [1ms, 600s] with a 120-second default.
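The clamping rule can be sketched in a few lines of Go. This is an illustration only, not the code in client.go: the constant and function names are hypothetical, and treating a zero or negative value as "not set" is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// Bounds and default as stated in the text; the identifiers are hypothetical.
const (
	minTimeout     = 1 * time.Millisecond
	maxTimeout     = 600 * time.Second
	defaultTimeout = 120 * time.Second
)

// clampTimeout returns the default when no timeout was requested (assumed to
// be signalled by a zero or negative value) and otherwise forces the request
// into [minTimeout, maxTimeout].
func clampTimeout(requested time.Duration) time.Duration {
	if requested <= 0 {
		return defaultTimeout
	}
	if requested < minTimeout {
		return minTimeout
	}
	if requested > maxTimeout {
		return maxTimeout
	}
	return requested
}

func main() {
	for _, d := range []time.Duration{0, 500 * time.Microsecond, 30 * time.Second, 15 * time.Minute} {
		fmt.Printf("%v -> %v\n", d, clampTimeout(d))
	}
}
```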

This funnel is what ask_agent, ask_agent_stream, ask_agents, and the desktop inspector chat all use. Behavior is identical from the caller’s perspective; the only difference is fan-out vs single-target.

The three tools

The agent’s builtin tools that ride on the funnel:

Tool	When to use
`ask_agent`	One target, one-shot. You want only the final reply.
`ask_agent_stream`	One target, you want to surface tokens to a UI as they arrive.
`ask_agents`	Many targets in parallel. You want to compare answers.

All three accept hostname (or hostnames), prompt, and a timeout_ms override. See internal/agent/builtin/tools/connecttool/ for the full surface.

`ask_agent`

Single host, unary reply. Simplest invocation:


{
  "tool": "ask_agent",
  "hostname": "prod-api-1",
  "prompt": "Tail the last 100 lines of /var/log/nginx/error.log and summarize anomalies."
}

Returns the remote agent’s reply as a single string plus metadata (latency, tool calls the remote agent made, etc.). Useful for “ask the agent on machine X what it sees” without UI streaming.

`ask_agent_stream`

Same input shape, streaming output. Each event is one of:

Event	Meaning
`TOKEN`	Token chunk to append to the reply buffer.
`TOOL_START`	Remote agent started a tool call.
`TOOL_END`	Remote agent finished a tool call.
`THINKING`	Reasoning trace (verbose; off by default).
`ERROR`	Remote agent reported a failure mid-stream.
`HANDOFF`	Subagent handoff.
`CANCELLED`	The run was cancelled mid-stream.

The desktop inspector chat consumes this stream directly — see desktop-inspector-chat. When a daemon is older than the per-token-events change (2026-04-26), it emits a single TOKEN event at the end with the full reply; the funnel handles both shapes transparently.
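A consumer that only appends TOKEN text to a buffer works with both shapes, because a legacy daemon's single TOKEN event is just the degenerate case of many. The sketch below shows that idea. The Event struct and the Go constant names are hypothetical stand-ins, and returning an error on ERROR or CANCELLED is one possible policy, not necessarily what the funnel does.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// EventType mirrors the event names in the table above. The Go types here are
// hypothetical and exist only for this illustration.
type EventType string

const (
	Token     EventType = "TOKEN"
	ToolStart EventType = "TOOL_START"
	ToolEnd   EventType = "TOOL_END"
	Thinking  EventType = "THINKING"
	ErrorEvt  EventType = "ERROR"
	Handoff   EventType = "HANDOFF"
	Cancelled EventType = "CANCELLED"
)

type Event struct {
	Type EventType
	Text string
}

// collectReply folds a sequence of events into the reply text. It stops and
// returns the partial reply plus an error on ERROR or CANCELLED.
func collectReply(events []Event) (string, error) {
	var buf strings.Builder
	for _, ev := range events {
		switch ev.Type {
		case Token:
			buf.WriteString(ev.Text)
		case ErrorEvt:
			return buf.String(), errors.New("remote error: " + ev.Text)
		case Cancelled:
			return buf.String(), errors.New("run cancelled")
		default:
			// TOOL_START, TOOL_END, THINKING, HANDOFF: progress signals that
			// do not contribute to the reply text.
		}
	}
	return buf.String(), nil
}

func main() {
	perToken := []Event{{Token, "disk "}, {ToolStart, "df"}, {ToolEnd, "df"}, {Token, "is fine"}}
	legacy := []Event{{Token, "disk is fine"}}
	a, _ := collectReply(perToken)
	b, _ := collectReply(legacy)
	fmt.Println(a == b)
}
```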

`ask_agents`

Multi-target fan-out. Implementation lives in internal/agent/builtin/tools/connecttool/ask_agents.go:79–237.


{
  "tool": "ask_agents",
  "hostnames": ["vps-audi", "vps-bmw", "prod-api-1"],
  "prompt": "What is your free disk percentage on / and where is it pointing?",
  "timeout_ms": 60000
}

Behavior:

Per-host timeout clamped to [1s, 300s], default 120s.
Total deadline clamped to [1s, 600s], default 240s.
Dedup — duplicate hostnames are removed while preserving input order, so the result map is deterministic.
Result map keyed by hostname, each value tagged as Response, RemoteError, or Error.
Cancellation — when the total deadline fires, in-flight workers are cancelled and unfinished hosts are listed in TimedOut.
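The dedup step is the usual order-preserving pattern: a set for membership plus an output slice for order. A minimal sketch follows; the function name is hypothetical and it compares hostnames exactly, which may differ from how ask_agents.go normalizes them.

```go
package main

import "fmt"

// dedupeHostnames drops repeated hostnames and keeps the first occurrence of
// each, so the output order matches the input order.
func dedupeHostnames(hosts []string) []string {
	seen := make(map[string]struct{}, len(hosts))
	out := make([]string, 0, len(hosts))
	for _, h := range hosts {
		if _, dup := seen[h]; dup {
			continue
		}
		seen[h] = struct{}{}
		out = append(out, h)
	}
	return out
}

func main() {
	fmt.Println(dedupeHostnames([]string{"vps-audi", "vps-bmw", "vps-audi", "prod-api-1"}))
}
```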

Result envelope:


{
  "results": {
    "vps-audi":   { "type": "Response",    "reply": "Free 42% on /, ext4." },
    "vps-bmw":    { "type": "Response",    "reply": "Free 11% on /, btrfs." },
    "prod-api-1": { "type": "Error",       "class": "offline" }
  },
  "timed_out": []
}
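A caller that receives this envelope as JSON might decode it along these lines. The struct and function names are hypothetical. The JSON keys are taken from the sample above, except `message`, which is an assumed key for the human-readable text.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// HostResult is one entry of the results map. "message" is an assumed key.
type HostResult struct {
	Type    string `json:"type"` // "Response", "RemoteError", or "Error"
	Reply   string `json:"reply,omitempty"`
	Class   string `json:"class,omitempty"`
	Message string `json:"message,omitempty"`
}

type Envelope struct {
	Results  map[string]HostResult `json:"results"`
	TimedOut []string              `json:"timed_out"`
}

const sampleEnvelope = `{
  "results": {
    "vps-audi":   { "type": "Response", "reply": "Free 42% on /, ext4." },
    "vps-bmw":    { "type": "Response", "reply": "Free 11% on /, btrfs." },
    "prod-api-1": { "type": "Error",    "class": "offline" }
  },
  "timed_out": []
}`

func parseEnvelope(raw string) (Envelope, error) {
	var env Envelope
	err := json.Unmarshal([]byte(raw), &env)
	return env, err
}

// failedHosts returns, in sorted order, every host whose entry is not a
// Response. Sorting is needed because Go map iteration order is random.
func failedHosts(env Envelope) []string {
	var out []string
	for host, r := range env.Results {
		if r.Type != "Response" {
			out = append(out, host)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	env, err := parseEnvelope(sampleEnvelope)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println("failed:", failedHosts(env), "timed out:", env.TimedOut)
}
```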

Error taxonomy

Errors are classified so prompts can branch on them sensibly:

Class	Cause	Where to look
`resolve_error`	Unknown or ambiguous hostname.	machines — fuzzy resolution rules.
`offline`	Target machine `is_online=false`.	`cmdop agent status` on the target.
`dial_error`	Network or TLS failure between caller and relay.	Local relay logs.
`auth_error`	No API key resolved or OAuth expired.	`cmdop login`, `cmdop connect key get`.
`remote_error`	Target agent reached but reported a failure.	Target machine’s daemon logs.
`timeout`	Per-host or total deadline fired.	Raise `timeout_ms` or split into smaller fan-outs.

The Error variant returned by ask_agents carries the class and a human-readable message. RemoteError is reserved for the case where the remote agent acknowledged the request but its own run failed — informationally distinct from “we never reached it”.
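Calling code can branch on the class string in the same way a prompt would. The sketch below simply restates the "where to look" column of the table as a lookup. The function name is hypothetical and the fallback for unknown classes is an assumption.

```go
package main

import "fmt"

// whereToLook maps each documented error class to a first diagnostic step.
// Unknown classes fall through to a generic hint.
func whereToLook(class string) string {
	switch class {
	case "resolve_error":
		return "check the hostname against the machines list"
	case "offline":
		return "run `cmdop agent status` on the target"
	case "dial_error":
		return "inspect the local relay logs"
	case "auth_error":
		return "run `cmdop login` or `cmdop connect key get`"
	case "remote_error":
		return "inspect the target machine's daemon logs"
	case "timeout":
		return "adjust timeout_ms or split into smaller fan-outs"
	default:
		return "unrecognized class; read the error message"
	}
}

func main() {
	for _, c := range []string{"offline", "auth_error", "something_else"} {
		fmt.Printf("%s: %s\n", c, whereToLook(c))
	}
}
```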

Permissions live on the receiver

A subtle but important rule: the permission gate fires on the target machine, not on the caller. Outgoing ask_agent from your laptop is not gated locally (you, the operator, are running the laptop’s CLI). The receiver’s permissions.yaml decides whether the inbound prompt’s tool calls may execute.

The lone exception is self-to-self calls: when the caller and target share the same OAuth identity (verified via CallerHostname), the receiver bypasses the gate. This makes a single operator’s multi-machine flow ergonomic without a permission carve-out.

See ../concepts/agent-communication and ../guides/permissions/rule-grammar for the rule DSL.

Self-to-self calls (same OAuth user) skip the permission gate by design. If you want to hard-gate everything, run the receiver under a different account or set strict mode in permissions.yaml.
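The receiver-side decision can be summarized as a small predicate. This is a sketch of the rule as described here, not the daemon's code: the function name, the string identities, and the `strict` flag standing in for the strict-mode setting are all hypothetical, and treating an empty identity as "not self" is a defensive assumption.

```go
package main

import "fmt"

// gateApplies reports whether the receiver should evaluate its permission
// rules for an inbound call. Strict mode always gates. Otherwise the gate is
// skipped only when caller and receiver share the same known identity.
func gateApplies(callerIdentity, receiverIdentity string, strict bool) bool {
	if strict {
		return true
	}
	if callerIdentity == "" || receiverIdentity == "" {
		return true // an unknown identity is never treated as self-to-self
	}
	return callerIdentity != receiverIdentity
}

func main() {
	fmt.Println(gateApplies("alice", "alice", false)) // self-to-self
	fmt.Println(gateApplies("alice", "bob", false))   // cross-account
	fmt.Println(gateApplies("alice", "alice", true))  // strict mode
}
```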

When to use which

A short decision flow:

Need a shell command? Use cmdop connect exec. Cheaper, simpler, no LLM involved.
Need the remote LLM’s opinion on one machine? Use ask_agent.
Want to surface tokens to a UI? Use ask_agent_stream.
Need to compare answers across many machines? Use ask_agents.

If your prompt mixes “run a command” and “have the remote LLM interpret the result”, ask_agent is the right tool — it can call shell tools on its end and surface the synthesized answer.

Latency model

Each fan-out call pays:

Caller resolve / dial / auth. Cached after the first call in a daemon process.
Per-host RTT. Relay → target → relay → caller.
Remote agent run time. Whatever the LLM and any tool calls take on the far side.

For single-target calls, the dominant cost is the remote agent run. For fan-out, parallel goroutines mean total time approximates the slowest host (clamped by the total deadline).
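The "slowest host, clamped by the total deadline" behavior falls out of a standard pattern: one goroutine per host, a per-host context derived from a shared context that carries the total deadline. The sketch below simulates remote calls with sleeps and uses millisecond-scale timeouts so it runs quickly. Every name in it (askFunc, Outcome, fanOut, runDemo) is hypothetical, the host list is assumed to be deduplicated already, and this is not the code in ask_agents.go.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// askFunc stands in for a single-target call through the funnel.
type askFunc func(ctx context.Context, host string) (string, error)

type Outcome struct {
	Replies  map[string]string
	Errors   map[string]error
	TimedOut []string
}

// fanOut calls ask for every host in parallel. Each call is bounded by
// perHost and the whole batch by total. Hosts still unfinished when the total
// deadline fires are reported in TimedOut.
func fanOut(hosts []string, perHost, total time.Duration, ask askFunc) Outcome {
	ctx, cancel := context.WithTimeout(context.Background(), total)
	defer cancel()

	type result struct {
		host, reply string
		err         error
	}
	ch := make(chan result, len(hosts)) // buffered so late workers never block
	for _, h := range hosts {
		go func(host string) {
			hctx, hcancel := context.WithTimeout(ctx, perHost)
			defer hcancel()
			reply, err := ask(hctx, host)
			ch <- result{host, reply, err}
		}(h)
	}

	out := Outcome{Replies: map[string]string{}, Errors: map[string]error{}}
	done := map[string]bool{}
	for range hosts {
		select {
		case r := <-ch:
			done[r.host] = true
			switch {
			case r.err == nil:
				out.Replies[r.host] = r.reply
			case ctx.Err() != nil: // failure arrived after the total deadline fired
				out.TimedOut = append(out.TimedOut, r.host)
			default:
				out.Errors[r.host] = r.err
			}
		case <-ctx.Done():
			for _, h := range hosts {
				if !done[h] {
					out.TimedOut = append(out.TimedOut, h)
				}
			}
			return out
		}
	}
	return out
}

// runDemo simulates two fast hosts and one that outlives the total deadline.
func runDemo() Outcome {
	delays := map[string]time.Duration{
		"vps-audi":   10 * time.Millisecond,
		"vps-bmw":    20 * time.Millisecond,
		"prod-api-1": 5 * time.Second,
	}
	ask := func(ctx context.Context, host string) (string, error) {
		select {
		case <-time.After(delays[host]):
			return "ok from " + host, nil
		case <-ctx.Done():
			return "", ctx.Err()
		}
	}
	return fanOut([]string{"vps-audi", "vps-bmw", "prod-api-1"}, time.Second, 200*time.Millisecond, ask)
}

func main() {
	start := time.Now()
	o := runDemo()
	fmt.Println(len(o.Replies), "replies, timed out:", o.TimedOut, "in", time.Since(start).Round(50*time.Millisecond))
}
```

In the demo the batch returns at roughly the total deadline rather than after the slow host's five seconds, which is the clamping described above.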

Streaming caveat

The ask_agent_stream event types described above are what the funnel emits. As of 2026-04-26 the daemon ships per-token events, so a streaming UI can append tokens as they arrive. If the target daemon predates that change, the funnel still works but emits a single TOKEN event with the full reply at the end — the UI behaves as if the call were unary. Upgrading the target daemon flips it to true streaming with no client change.
