Server-to-Server
Server-to-server is what we call the pattern where a CMDOP agent on one
machine asks the agent on another machine to do something. Not “run a
shell command” — for that there is cmdop connect exec — but “ask the
remote LLM to look at this and reply”. It is the feature that lets your
laptop’s agent ask the prod-1 agent to inspect logs, then ask db-1 to
validate the schema, all from a single chat turn.
The single funnel
Every cross-machine agent call goes through one piece of code:
internal/connect/remoteagent/client.go:79–101. The funnel exposes two
methods:
- Ask(ctx, opts) (*AskResult, error): one-shot, full reply in one struct.
- AskStream(ctx, opts, handler) (*AskStreamResult, error): token stream, handler invoked per event.
Both internally do: workspace resolve → fuzzy machine resolve → online
precondition → dial → SetMachine → AgentService.Run (or Stream).
The timeout is clamped to [1ms, 600s] with a 120-second default.
This funnel is what ask_agent, ask_agent_stream, ask_agents, and
the desktop inspector chat all use. Behavior is identical from the
caller’s perspective; the only difference is fan-out vs single-target.
The three tools
The agent’s builtin tools that ride on the funnel:
| Tool | When to use |
|---|---|
| ask_agent | One target, one-shot. You want only the final reply. |
| ask_agent_stream | One target, you want to surface tokens to a UI as they arrive. |
| ask_agents | Many targets in parallel. You want to compare answers. |
All three accept hostname (or hostnames), prompt, and a timeout_ms
override. See internal/agent/builtin/tools/connecttool/ for the full
surface.
ask_agent
Single host, unary reply. Simplest invocation:
{
  "tool": "ask_agent",
  "hostname": "prod-api-1",
  "prompt": "Tail the last 100 lines of /var/log/nginx/error.log and summarize anomalies."
}

Returns the remote agent’s reply as a single string plus metadata (latency, tool calls the remote agent made, etc.). Useful for “ask the agent on machine X what it sees” without UI streaming.
ask_agent_stream
Same input shape, streaming output. Each event is one of:
| Event | Meaning |
|---|---|
| TOKEN | Token chunk to append to the reply buffer. |
| TOOL_START | Remote agent started a tool call. |
| TOOL_END | Remote agent finished a tool call. |
| THINKING | Reasoning trace (verbose; off by default). |
| ERROR | Remote agent reported a failure mid-stream. |
| HANDOFF | Subagent handoff. |
| CANCELLED | The run was cancelled mid-stream. |
The desktop inspector chat consumes this stream directly — see
desktop-inspector-chat. When a daemon is
older than the per-token-events change (2026-04-26), it emits a single
TOKEN event at the end with the full reply; the funnel handles both
shapes transparently.
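A consumer that appends every TOKEN event to a buffer produces the same reply for both shapes, which is why no client change is needed. The sketch below illustrates that idea; the Event and Collector types are hypothetical stand-ins, not the funnel's actual handler signature.

```go
package main

import (
	"fmt"
	"strings"
)

// EventType mirrors the event names in the table above. The Go types and
// field names are hypothetical; the real definitions may differ.
type EventType string

const (
	Token     EventType = "TOKEN"
	ToolStart EventType = "TOOL_START"
	ToolEnd   EventType = "TOOL_END"
	Errored   EventType = "ERROR"
	Cancelled EventType = "CANCELLED"
)

type Event struct {
	Type EventType
	Text string
}

// Collector accumulates a reply and a few counters from a stream of events.
type Collector struct {
	buf       strings.Builder
	ToolCalls int
	Failed    bool
	Cancelled bool
}

// Handle processes one event. Event types it does not know are ignored.
func (c *Collector) Handle(ev Event) {
	switch ev.Type {
	case Token:
		c.buf.WriteString(ev.Text) // works for many small chunks or one big one
	case ToolStart:
		c.ToolCalls++
	case Errored:
		c.Failed = true
	case Cancelled:
		c.Cancelled = true
	}
}

// Reply returns the text accumulated so far.
func (c *Collector) Reply() string { return c.buf.String() }

func main() {
	perToken := []Event{{Token, "disk "}, {ToolStart, "df"}, {ToolEnd, "df"}, {Token, "ok"}}
	legacy := []Event{{Token, "disk ok"}} // older daemon: one TOKEN at the end

	a, b := &Collector{}, &Collector{}
	for _, ev := range perToken {
		a.Handle(ev)
	}
	for _, ev := range legacy {
		b.Handle(ev)
	}
	fmt.Println(a.Reply() == b.Reply()) // true
}
```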
ask_agents
Multi-target fan-out. Implementation lives in
internal/agent/builtin/tools/connecttool/ask_agents.go:79–237.
{
  "tool": "ask_agents",
  "hostnames": ["vps-audi", "vps-bmw", "prod-api-1"],
  "prompt": "What is your free disk percentage on / and where is it pointing?",
  "timeout_ms": 60000
}

Behavior:
- Per-host timeout clamped to [1s, 300s], default 120s.
- Total deadline clamped to [1s, 600s], default 240s.
- Dedup: duplicate hostnames are removed while preserving input order, so the result map is deterministic.
- Result map keyed by hostname, each value tagged as Response, RemoteError, or Error.
- Cancellation: when the total deadline fires, in-flight workers are cancelled and unfinished hosts are listed in TimedOut.
Result envelope:
{
  "results": {
    "vps-audi": { "type": "Response", "reply": "Free 42% on /, ext4." },
    "vps-bmw": { "type": "Response", "reply": "Free 11% on /, btrfs." },
    "prod-api-1": { "type": "Error", "class": "offline" }
  },
  "timed_out": []
}

Error taxonomy
Errors are classified so prompts can branch on them sensibly:
| Class | Cause | Where to look |
|---|---|---|
| resolve_error | Unknown or ambiguous hostname. | machines (fuzzy resolution rules). |
| offline | Target machine is_online=false. | cmdop agent status on the target. |
| dial_error | Network or TLS failure between caller and relay. | Local relay logs. |
| auth_error | No API key resolved or OAuth expired. | cmdop login, cmdop connect key get. |
| remote_error | Target agent reached but reported a failure. | Target machine’s daemon logs. |
| timeout | Per-host or total deadline fired. | Raise timeout_ms or split into smaller fan-outs. |
The Error variant returned by ask_agents carries the class and a
human-readable message. RemoteError is reserved for the case where the
remote agent acknowledged the request but its own run failed —
informationally distinct from “we never reached it”.
Permissions live on the receiver
A subtle but important rule: the permission gate fires on the target
machine, not on the caller. Outgoing ask_agent from your laptop is
not gated locally (you, the operator, are running the laptop’s CLI).
The receiver’s permissions.yaml decides whether the inbound prompt’s
tool calls may execute.
The lone exception is self-to-self calls: when the caller and target
share the same OAuth identity (verified via CallerHostname), the
receiver bypasses the gate. This makes a single operator’s
multi-machine flow ergonomic without a permission carve-out.
The receiver decides what tools the caller may invoke. See ../concepts/agent-communication and ../guides/permissions/rule-grammar for the rule DSL.
Self-to-self calls (same OAuth user) skip the permission gate by
design. If you want to hard-gate everything, run the receiver under a
different account or set strict mode in permissions.yaml.
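The receiver-side rule reduces to a small predicate. The sketch below is only a reading of the prose above: the Inbound type, its fields, and the strict flag are hypothetical, and it assumes strict mode gates self-to-self calls as well.

```go
package main

import "fmt"

// Inbound describes an incoming call as the receiver might see it.
// Both fields are hypothetical names for the OAuth identities involved.
type Inbound struct {
	CallerIdentity   string
	ReceiverIdentity string
}

// gateApplies reports whether the receiver should evaluate its permission
// rules for this call. Assumption: strict mode gates every call, including
// calls from the same identity.
func gateApplies(in Inbound, strict bool) bool {
	if strict {
		return true
	}
	// Self-to-self calls (same identity on both ends) skip the gate.
	return in.CallerIdentity != in.ReceiverIdentity
}

func main() {
	self := Inbound{CallerIdentity: "alice", ReceiverIdentity: "alice"}
	other := Inbound{CallerIdentity: "bob", ReceiverIdentity: "alice"}

	fmt.Println(gateApplies(self, false))  // false: bypassed
	fmt.Println(gateApplies(other, false)) // true: gated
	fmt.Println(gateApplies(self, true))   // true: strict mode gates everything
}
```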
When to use which
A short decision flow:
- Need a shell command? Use cmdop connect exec. Cheaper, simpler, no LLM involved.
- Need the remote LLM’s opinion on one machine? Use ask_agent.
- Want to surface tokens to a UI? Use ask_agent_stream.
- Need to compare answers across many machines? Use ask_agents.
If your prompt mixes “run a command” and “have the remote LLM
interpret the result”, ask_agent is the right tool — it can call
shell tools on its end and surface the synthesized answer.
Latency model
Each fan-out call pays:
- Caller resolve / dial / auth. Cached after the first call in a daemon process.
- Per-host RTT. Relay → target → relay → caller.
- Remote agent run time. Whatever the LLM and any tool calls take on the far side.
For single-target calls, the dominant cost is the remote agent run. For fan-out, parallel goroutines mean total time approximates the slowest host (clamped by the total deadline).
Streaming caveat
The ask_agent_stream event types described above are what the
funnel emits. As of 2026-04-26 the daemon ships per-token events, so a
streaming UI can append tokens as they arrive. If the target daemon
predates that change, the funnel still works but emits a single
TOKEN event with the full reply at the end — the UI behaves as if
the call were unary. Upgrading the target daemon flips it to true
streaming with no client change.
Related
- Path A: direct pipe to a remote agent from the desktop UI.
- Path B: local LLM with remote tools enabled.
- Persistent multi-command sessions for long-running tasks.
- The conceptual model and permission rules behind these calls.