Persistent Remote Sessions
A persistent session is a long-lived shell-like context on a remote machine that survives across many tool calls within one chat turn (and beyond). It is the right model when “one shot exec” is too coarse — when you need to send a sequence of related commands, watch their output as it trickles in, and only close the session when the work is done.
The implementation lives in internal/connect/sessionmgr/ and is exposed
to agents as the ssh_session tool.
When to use it
The decision tree:
- One command, finishes in seconds. Use cmdop connect exec.
- Interactive shell, human at the keyboard. Use interactive-attach.
- Many commands stitched together by an agent over many turns. Use a persistent session.
Concrete use cases:
- An agent doing a multi-step build/deploy that wants to read intermediate output before deciding the next step.
- Tailing a slow log while running diagnostic commands in the same context.
- Keeping a long-running process (compile, container build, migration) alive across multiple read polls without re-establishing auth.
A persistent session is a sessionmgr object. It is not the same as a
terminal session opened by cmdop connect. The former is many commands
on one ringbuf; the latter is one PTY the user is interacting with.
See ../concepts/sessions for both kinds.
Lifecycle
Sessions move through five states:
opening → ready → busy → closing → closed

- opening. Manager has accepted the request and is dialing the target.
- ready. The session is dialed and idle, waiting for the next command.
- busy. A command is running. Output is flowing into the ring buffer.
- closing. Close has been requested; manager is finishing in-flight IO.
- closed. Terminal state. Session ID can no longer be addressed.
State transitions and the manager goroutine live in
internal/connect/sessionmgr/types.go:18–78 and manager.go:26–183.
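As a rough illustration of the lifecycle above, the sketch below models the five states and a transition check. The type and function names are hypothetical and are not taken from types.go. The busy → ready edge (a command finishing) and the early jumps to closing are assumptions inferred from the state descriptions, not confirmed behavior.

```go
package main

import "fmt"

// State is a hypothetical enum mirroring the five documented states.
type State string

const (
	Opening State = "opening"
	Ready   State = "ready"
	Busy    State = "busy"
	Closing State = "closing"
	Closed  State = "closed"
)

// allowed lists the transitions this sketch treats as legal.
// Assumption: busy returns to ready when a command finishes, and any
// live state may move to closing when a close is requested.
var allowed = map[State][]State{
	Opening: {Ready, Closing},
	Ready:   {Busy, Closing},
	Busy:    {Ready, Closing},
	Closing: {Closed},
	Closed:  nil, // terminal
}

// CanTransition reports whether moving from one state to another is allowed.
func CanTransition(from, to State) bool {
	for _, s := range allowed[from] {
		if s == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(CanTransition(Ready, Busy))   // true
	fmt.Println(CanTransition(Closed, Ready)) // false
}
```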
Manager defaults
Manager.Open / Close / Get / List / Count is the internal API; the
agent-facing surface is the ssh_session tool. Defaults:
- Max sessions per process: 64. Hitting the cap returns an error from Open.
- Ring buffer size: 1 MiB per session.
- Idle TTL: 30 minutes. After 30 minutes with no activity, a reaper goroutine closes the session.
- State machine grace: Closing is graceful: in-flight reads drain before the session moves to closed.
The ring buffer size and idle TTL are tunable per session via the Open
arguments; see “Idle TTL and the reaper” below.
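To make the session cap concrete, here is a minimal sketch of a registry that refuses to open more than 64 sessions. It is not the real Manager: the method signatures, the ErrManagerFull value, and the demo-N IDs are all hypothetical, and only the numeric defaults come from the list above.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// Defaults quoted from the documentation above.
const (
	DefaultMaxSessions = 64
	DefaultRingBufSize = 1 << 20 // 1 MiB
	DefaultIdleTTL     = 30 * time.Minute
)

// ErrManagerFull is a hypothetical error value for hitting the cap.
var ErrManagerFull = errors.New("manager full")

// Manager is a toy registry that only tracks which IDs are open.
type Manager struct {
	mu       sync.Mutex
	nextID   int
	sessions map[string]bool
}

func NewManager() *Manager {
	return &Manager{sessions: make(map[string]bool)}
}

// Open registers a new session or fails once the cap is reached.
func (m *Manager) Open() (string, error) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if len(m.sessions) >= DefaultMaxSessions {
		return "", ErrManagerFull
	}
	m.nextID++
	id := fmt.Sprintf("demo-%d", m.nextID) // illustrative ID format only
	m.sessions[id] = true
	return id, nil
}

// Close removes a session, freeing a slot under the cap.
func (m *Manager) Close(id string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	delete(m.sessions, id)
}

// Count returns the number of open sessions.
func (m *Manager) Count() int {
	m.mu.Lock()
	defer m.mu.Unlock()
	return len(m.sessions)
}

func main() {
	m := NewManager()
	var first string
	for i := 0; i < DefaultMaxSessions; i++ {
		id, _ := m.Open()
		if i == 0 {
			first = id
		}
	}
	_, err := m.Open()
	fmt.Println(m.Count(), err) // 64 manager full
	m.Close(first)
	_, err = m.Open()
	fmt.Println(m.Count(), err) // 64 <nil>
}
```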
The agent tool surface
ssh_session exposes five operations on a session:
| Operation | Effect |
|---|---|
| open | Dial a new session on a target machine. Returns a session_id. |
| send | Send a command. Output goes into the ring buffer. |
| read | Poll output since a given offset. Returns (data, next_cursor, truncated). |
| list | List the caller’s open sessions. |
| close | Close a session and free its ring buffer. |
A typical agent sequence:
    { "tool": "ssh_session", "operation": "open",
      "hostname": "vps-bmw", "idle_ttl_seconds": 0 }
    // → { "session_id": "sess_3a2b...", "state": "ready" }

    { "tool": "ssh_session", "operation": "send",
      "session_id": "sess_3a2b...", "command": "make build 2>&1" }

    { "tool": "ssh_session", "operation": "read",
      "session_id": "sess_3a2b...", "offset": 0 }
    // → { "data": "...", "next_cursor": 4096, "truncated": false }

    { "tool": "ssh_session", "operation": "read",
      "session_id": "sess_3a2b...", "offset": 4096 }
    // → { "data": "...", "next_cursor": 8192, "truncated": false, "exit_code": 0 }

    { "tool": "ssh_session", "operation": "close", "session_id": "sess_3a2b..." }

The ring buffer
Long-running commands can dump megabytes of output. A naive design
either bounds memory poorly or drops data without telling the caller.
sessionmgr uses a ring buffer with monotonic offsets and an explicit
truncated flag.
Design points (see internal/connect/sessionmgr/ringbuf.go:14–84):
- Default size: 1 MiB. Tunable per session.
- Monotonic offsets. Every byte ever written to the session has a stable offset, even after wrap-around. Callers track a cursor.
- Drop oldest on overflow. When the buffer fills, the writer drops the oldest data and sets truncated=true on subsequent Since reads.
- Since(offset). Returns (data, next_cursor, truncated). If truncated=true, some data between the old offset and the start of the returned slice was dropped.
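The design points above can be sketched as a small byte ring with monotonic offsets. This is an illustrative reconstruction, not the code in ringbuf.go: the RingBuf type and its method signatures are assumptions. It is also an assumption that truncated is reported only when the requested offset falls behind the oldest retained byte. Locking is omitted for brevity.

```go
package main

import "fmt"

// RingBuf is a hypothetical fixed-capacity byte ring. total counts every
// byte ever written, so it doubles as the offset of the next byte.
// Not safe for concurrent use; a real implementation would need a lock.
type RingBuf struct {
	buf   []byte
	total int64
}

func NewRingBuf(size int) *RingBuf {
	return &RingBuf{buf: make([]byte, size)}
}

// Write appends p, overwriting the oldest bytes once the ring is full.
func (r *RingBuf) Write(p []byte) {
	for _, b := range p {
		r.buf[r.total%int64(len(r.buf))] = b
		r.total++
	}
}

// Since returns everything still retained from offset onward, the cursor
// to pass on the next call, and whether bytes before the returned slice
// were already overwritten.
func (r *RingBuf) Since(offset int64) (data []byte, next int64, truncated bool) {
	oldest := r.total - int64(len(r.buf))
	if oldest < 0 {
		oldest = 0
	}
	if offset < oldest {
		offset = oldest // the caller fell behind; skip to the oldest byte
		truncated = true
	}
	if offset > r.total {
		offset = r.total
	}
	data = make([]byte, 0, r.total-offset)
	for i := offset; i < r.total; i++ {
		data = append(data, r.buf[i%int64(len(r.buf))])
	}
	return data, r.total, truncated
}

func main() {
	rb := NewRingBuf(8)
	rb.Write([]byte("hello "))
	data, cur, trunc := rb.Since(0)
	fmt.Printf("%q %d %v\n", data, cur, trunc) // "hello " 6 false

	rb.Write([]byte("persistent world")) // 16 bytes into an 8-byte ring
	data, cur, trunc = rb.Since(cur)
	fmt.Printf("%q %d %v\n", data, cur, trunc) // "nt world" 22 true
}
```

Because offsets never reset, several readers can each keep their own cursor against the same buffer without coordinating with one another.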
Truncation handling pattern:
    loop:
        data, next_cursor, truncated = read(offset=offset)
        if truncated:
            surface a warning ("dropped X bytes")
            optionally back off and read with smaller windows
        consume(data)
        offset = next_cursor

The flag is informational today: no consumer aborts on
truncated=true. It still lets careful agents detect data loss between
polls.
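One way a caller might compute the "dropped X bytes" figure is sketched below. It assumes the returned data is contiguous and ends at next_cursor, so the slice starts at next_cursor minus len(data). That layout is an assumption, not something the tool contract states. ReadFunc, Drain, and fakeReader are hypothetical names. A real agent would more likely stop on exit_code than on "no new data".

```go
package main

import "fmt"

// ReadFunc stands in for the tool's read operation (hypothetical signature).
type ReadFunc func(offset int64) (data []byte, nextCursor int64, truncated bool)

// Drain polls until a read makes no progress. It returns the collected
// output, the number of bytes known to be lost, and the final cursor.
func Drain(read ReadFunc, offset int64) (out []byte, dropped int64, cursor int64) {
	for {
		data, next, truncated := read(offset)
		if truncated {
			// Assumption: data ends at next, so it starts at next-len(data).
			// Anything between our offset and that start was overwritten.
			if start := next - int64(len(data)); start > offset {
				dropped += start - offset
			}
		}
		out = append(out, data...)
		if next == offset {
			return out, dropped, next
		}
		offset = next
	}
}

// fakeReader simulates a session whose buffer retains only the last
// window bytes of stream.
func fakeReader(stream []byte, window int64) ReadFunc {
	return func(offset int64) ([]byte, int64, bool) {
		total := int64(len(stream))
		truncated := false
		if oldest := total - window; offset < oldest {
			offset, truncated = oldest, true
		}
		return stream[offset:], total, truncated
	}
}

func main() {
	read := fakeReader([]byte("0123456789abcdefghij"), 8)
	out, dropped, cursor := Drain(read, 0)
	fmt.Printf("%q dropped=%d cursor=%d\n", out, dropped, cursor)
	// "cdefghij" dropped=12 cursor=20
}
```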
Idle TTL and the reaper
The manager runs a reaper goroutine that closes sessions whose last activity is older than the idle TTL. The default is 30 minutes:
- Read or send activity bumps a last_activity timestamp.
- The reaper sweeps every minute or so and closes anything past TTL.
- Closed sessions free their ring buffer immediately.
For tasks expected to take longer than 30 minutes (long builds, log
tails, big migrations), pass idle_ttl_seconds: 0 on open to
disable the reaper for that session:
    { "tool": "ssh_session", "operation": "open",
      "hostname": "prod-api-1", "idle_ttl_seconds": 0 }

Persistent sessions reserve memory (1 MiB ringbuf each by default). The cap is 64 per process; if you fan out heavily across board automations, watch the count and close sessions you no longer need.
Use idle_ttl_seconds: 0 for builds, deploys, or any task expected to
take longer than 30 minutes. Otherwise the reaper will close the
session out from under you.
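The reaper's selection rule can be sketched as a pure function over session metadata. The struct fields and the expired helper are hypothetical. Timestamps are plain Unix seconds to keep the example small, and a TTL of 0 is treated as "never reap", matching the idle_ttl_seconds: 0 behavior described above.

```go
package main

import "fmt"

// session holds only the fields the sweep needs (hypothetical layout).
type session struct {
	id             string
	lastActivity   int64 // Unix seconds of the last read or send
	idleTTLSeconds int64 // 0 disables the reaper for this session
}

// expired returns the IDs a sweep at nowUnix would close.
func expired(sessions []session, nowUnix int64) []string {
	var ids []string
	for _, s := range sessions {
		if s.idleTTLSeconds == 0 {
			continue // reaper disabled for this session
		}
		if nowUnix-s.lastActivity > s.idleTTLSeconds {
			ids = append(ids, s.id)
		}
	}
	return ids
}

func main() {
	now := int64(10_000)
	sessions := []session{
		{"idle", now - 45*60, 30 * 60},  // idle for 45 minutes, TTL 30 minutes
		{"active", now - 5*60, 30 * 60}, // idle for 5 minutes
		{"pinned", now - 3*60*60, 0},    // idle for 3 hours, reaper disabled
	}
	fmt.Println(expired(sessions, now)) // [idle]
}
```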
Driver injection
sessionmgr does not talk to the relay directly; it accepts a
Driver interface. Production wires:
- connectclient.Remote for the dial.
- SetMachine once per dial.
- Execute (unary) and AttachStream (bidi) for the actual IO.
Tests inject a fake driver. This split is the reason the package can
be unit-tested without standing up a real relay. See
internal/connect/sessionmgr/driver.go and the test fixtures
alongside it.
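The value of the split can be illustrated with a stripped-down interface and a fake. The real Driver interface in driver.go is not reproduced here: the single-method shape, the Execute signature (which also leaves out context handling), and the Session wrapper below are assumptions made for the example.

```go
package main

import "fmt"

// Driver is a hypothetical, reduced version of the injected interface.
type Driver interface {
	Execute(command string) (output []byte, exitCode int, err error)
}

// fakeDriver returns canned output and records what it was asked to run.
type fakeDriver struct {
	replies map[string]string
	calls   []string
}

func (f *fakeDriver) Execute(command string) ([]byte, int, error) {
	f.calls = append(f.calls, command)
	out, ok := f.replies[command]
	if !ok {
		return nil, 127, nil // mimic "command not found"
	}
	return []byte(out), 0, nil
}

// Session depends only on the interface, so tests never need a relay.
type Session struct {
	drv    Driver
	output []byte
}

// Send runs a command through the driver and accumulates its output.
func (s *Session) Send(command string) (int, error) {
	out, code, err := s.drv.Execute(command)
	if err != nil {
		return 0, err
	}
	s.output = append(s.output, out...)
	return code, nil
}

func main() {
	fake := &fakeDriver{replies: map[string]string{"make build": "ok\n"}}
	s := &Session{drv: fake}
	code, _ := s.Send("make build")
	fmt.Printf("%d %q %v\n", code, s.output, fake.calls) // 0 "ok\n" [make build]
}
```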
Reattach and identity
Session IDs come in two flavors:
- Daemon-issued. Format host-uuid:slotN. These are what sessionmgr assigns when an agent calls open.
- Desktop-minted. Format machine_<uuid>_<hex>. These are machine-scoped chat sessions used by the desktop inspector; see desktop-inspector-chat.
A session ID is valid for the lifetime of the daemon process that holds it. After daemon restart, sessions are not recovered — they were live in-memory state, not durable rows.
Multiple readers
Multiple agents (or multiple turns of one agent) can read the same
session concurrently. Each maintains its own offset. The ring buffer
serves all of them from the same underlying storage. Concurrent
send is not supported on a single session — if a send arrives
while the session is busy, it queues until the previous command
finishes.
Failure modes
| Symptom | Cause | Fix |
|---|---|---|
| session not found after a long pause. | Reaper closed it on idle TTL. | Re-open; pass idle_ttl_seconds: 0 next time. |
| truncated=true and you missed output. | Ring buffer wrapped between polls. | Poll more often, or accept the gap. |
| manager full (64 sessions) | Too many open sessions. | Close idle ones via ssh_session.close. |
| Long build hangs forever. | The remote process is genuinely stuck — sessionmgr does not enforce a per-command timeout. | Use the agent’s overall turn timeout, or close the session. |
Related
- The simpler tool when you only need one command.
- ask_agent and ask_agents for prompt-shaped work.
- The session-as-object model that underpins this surface.
- The auth gate on the underlying gRPC dial.