
Persistent Remote Sessions

A persistent session is a long-lived, shell-like context on a remote machine that survives across many tool calls within one chat turn (and beyond). It is the right model when a one-shot exec is too coarse: when you need to send a sequence of related commands, watch their output as it trickles in, and close the session only when the work is done.

The implementation lives in internal/connect/sessionmgr/ and is exposed to agents as the ssh_session tool.

When to use it

The decision tree:

  • One command, finishes in seconds. Use cmdop connect exec.
  • Interactive shell, human at the keyboard. Use interactive-attach.
  • Many commands stitched together by an agent over many turns. Use a persistent session.

Concrete use cases:

  • An agent doing a multi-step build/deploy that wants to read intermediate output before deciding the next step.
  • Tailing a slow log while running diagnostic commands in the same context.
  • Keeping a long-running process (compile, container build, migration) alive across multiple read polls without re-establishing auth.

A persistent session is a sessionmgr object. It is not the same as a terminal session opened by cmdop connect. The former is many commands on one ringbuf; the latter is one PTY the user is interacting with. See ../concepts/sessions for both kinds.

Lifecycle

Sessions move through five states:

opening → ready → busy → closing → closed
  • opening — Manager has accepted the request and is dialing the target.
  • ready — The session is dialed and idle, waiting for the next command.
  • busy — A command is running. Output is flowing into the ring buffer.
  • closing — Close has been requested; manager is finishing in-flight IO.
  • closed — Terminal state. Session ID can no longer be addressed.

State transitions and the manager goroutine live in internal/connect/sessionmgr/types.go:18–78 and manager.go:26–183.
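
The five states can be modeled as a small transition table. The following is a minimal sketch, not the code in types.go: the names State, allowed, and CanTransition are hypothetical, and the busy → ready and opening → closing edges are assumptions inferred from the state descriptions above (a session returns to idle when a command finishes; a failed dial has to be torn down).

```go
package main

import "fmt"

// State is a hypothetical enum for the five lifecycle states.
type State int

const (
	Opening State = iota
	Ready
	Busy
	Closing
	Closed
)

var stateNames = [...]string{"opening", "ready", "busy", "closing", "closed"}

func (s State) String() string { return stateNames[s] }

// allowed lists the transitions this sketch treats as legal.
// Assumption: the real state machine may permit a different set.
var allowed = map[State][]State{
	Opening: {Ready, Closing},
	Ready:   {Busy, Closing},
	Busy:    {Ready, Closing},
	Closing: {Closed},
	Closed:  {}, // terminal: nothing leaves closed
}

// CanTransition reports whether moving from one state to another is permitted.
func CanTransition(from, to State) bool {
	for _, next := range allowed[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(Ready, "to", Busy, CanTransition(Ready, Busy))     // ready to busy true
	fmt.Println(Closed, "to", Ready, CanTransition(Closed, Ready)) // closed to ready false
}
```

Keeping the table in one place makes it cheap to reject an illegal transition, such as sending to a closed session, before any IO happens.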

Manager defaults

Manager.Open / Close / Get / List / Count form the internal API; the agent-facing surface is the ssh_session tool. Defaults:

  • Max sessions per process: 64. Hitting the cap returns an error from Open.
  • Ring buffer size: 1 MiB per session.
  • Idle TTL: 30 minutes. After 30 minutes with no activity, a reaper goroutine closes the session.
  • State machine grace: Closing is graceful — in-flight reads drain before the session moves to closed.

The ring buffer size and idle TTL are tunable per session via the Open arguments; see “Idle TTL and the reaper” below.
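
To make the cap and the defaults concrete, here is a rough sketch of a manager that applies them at open time. It is illustrative only: the Open signature, the ErrManagerFull sentinel, and the demo-N session IDs are all hypothetical and are not taken from sessionmgr.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// Defaults taken from the list above.
const (
	defaultMaxSessions = 64
	defaultRingBytes   = 1 << 20 // 1 MiB
	defaultIdleTTL     = 30 * time.Minute
)

// ErrManagerFull is a hypothetical sentinel for hitting the session cap.
var ErrManagerFull = errors.New("manager full")

type Session struct {
	ID        string
	RingBytes int
	IdleTTL   time.Duration // 0 disables idle reaping
}

type Manager struct {
	mu       sync.Mutex
	nextID   int
	sessions map[string]*Session
}

func NewManager() *Manager {
	return &Manager{sessions: make(map[string]*Session)}
}

// Open registers a session. ringBytes <= 0 and a nil idleTTL select the defaults;
// a pointer to 0 disables the idle TTL for that session.
func (m *Manager) Open(ringBytes int, idleTTL *time.Duration) (*Session, error) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if len(m.sessions) >= defaultMaxSessions {
		return nil, ErrManagerFull
	}
	if ringBytes <= 0 {
		ringBytes = defaultRingBytes
	}
	ttl := defaultIdleTTL
	if idleTTL != nil {
		ttl = *idleTTL
	}
	m.nextID++
	// Placeholder ID; not the real session ID format.
	s := &Session{ID: fmt.Sprintf("demo-%d", m.nextID), RingBytes: ringBytes, IdleTTL: ttl}
	m.sessions[s.ID] = s
	return s, nil
}

// Close removes a session and frees its slot.
func (m *Manager) Close(id string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	delete(m.sessions, id)
}

// Count returns the number of open sessions.
func (m *Manager) Count() int {
	m.mu.Lock()
	defer m.mu.Unlock()
	return len(m.sessions)
}

func main() {
	m := NewManager()
	for i := 0; i < defaultMaxSessions; i++ {
		if _, err := m.Open(0, nil); err != nil {
			panic(err)
		}
	}
	_, err := m.Open(0, nil) // the 65th open is rejected
	fmt.Println(m.Count(), err)
}
```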

The agent tool surface

ssh_session exposes five operations on a session:

Operation | Effect
--------- | ------
open      | Dial a new session on a target machine. Returns a session_id.
send      | Send a command. Output goes into the ring buffer.
read      | Poll output since a given offset. Returns (data, next_cursor, truncated).
list      | List the caller’s open sessions.
close     | Close a session and free its ring buffer.

A typical agent sequence:

{ "tool": "ssh_session", "operation": "open", "hostname": "vps-bmw", "idle_ttl_seconds": 0 }
// → { "session_id": "sess_3a2b...", "state": "ready" }

{ "tool": "ssh_session", "operation": "send", "session_id": "sess_3a2b...", "command": "make build 2>&1" }

{ "tool": "ssh_session", "operation": "read", "session_id": "sess_3a2b...", "offset": 0 }
// → { "data": "...", "next_cursor": 4096, "truncated": false }

{ "tool": "ssh_session", "operation": "read", "session_id": "sess_3a2b...", "offset": 4096 }
// → { "data": "...", "next_cursor": 8192, "truncated": false, "exit_code": 0 }

{ "tool": "ssh_session", "operation": "close", "session_id": "sess_3a2b..." }

The ring buffer

Long-running commands can dump megabytes of output. A naive design either bounds memory poorly or drops data without telling the caller. sessionmgr uses a ring buffer with monotonic offsets and an explicit truncated flag.

Design points (see internal/connect/sessionmgr/ringbuf.go:14–84):

  • Default size: 1 MiB. Tunable per session.
  • Monotonic offsets. Every byte ever written to the session has a stable offset, even after wrap-around. Callers track a cursor.
  • Drop oldest on overflow. When the buffer fills, the writer drops the oldest data and sets truncated=true on subsequent Since reads.
  • Since(offset) — Returns (data, next_cursor, truncated). If truncated=true, some data between the old offset and the start of the returned slice was dropped.
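
The design points above can be sketched in a few lines of Go. This is an illustrative reconstruction, not the code in ringbuf.go: the Ring type and its fields are hypothetical, it assumes a capacity greater than zero, it is not safe for concurrent use, and it reports truncated only to a reader whose cursor has fallen behind the oldest retained byte, which may differ from how the real buffer sets the flag.

```go
package main

import "fmt"

// Ring is a fixed-capacity byte ring addressed by monotonic offsets.
type Ring struct {
	buf   []byte
	start int64 // offset of the oldest byte still retained
	end   int64 // offset one past the newest byte written
}

// NewRing allocates a ring; capacity must be > 0.
func NewRing(capacity int) *Ring {
	return &Ring{buf: make([]byte, capacity)}
}

// Write appends p, overwriting the oldest bytes once the ring is full.
func (r *Ring) Write(p []byte) {
	capacity := int64(len(r.buf))
	for _, b := range p {
		r.buf[r.end%capacity] = b
		r.end++
	}
	if r.end-r.start > capacity {
		r.start = r.end - capacity // drop oldest on overflow
	}
}

// Since returns every retained byte at or after offset, the cursor to pass
// on the next call, and whether bytes between offset and the start of the
// returned data were dropped.
func (r *Ring) Since(offset int64) (data []byte, next int64, truncated bool) {
	if offset < r.start {
		truncated = true
		offset = r.start
	}
	if offset > r.end {
		offset = r.end
	}
	capacity := int64(len(r.buf))
	data = make([]byte, 0, r.end-offset)
	for i := offset; i < r.end; i++ {
		data = append(data, r.buf[i%capacity])
	}
	return data, r.end, truncated
}

func main() {
	r := NewRing(8)
	r.Write([]byte("abcdef"))
	r.Write([]byte("ghijkl")) // 12 bytes written into 8: offsets 0-3 are gone
	data, next, truncated := r.Since(0)
	fmt.Printf("%q %d %v\n", data, next, truncated) // "efghijkl" 12 true
}
```

Because offsets never reset, a reader that kept up sees truncated=false even after the buffer has wrapped many times; only a reader whose cursor points at overwritten data is told about the gap.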

Truncation handling pattern:

loop:
    data, cursor, truncated = read(offset=offset)
    if truncated:
        surface a warning ("dropped X bytes")
        optionally back off and read with smaller windows
    consume(data)
    offset = cursor

The flag is informational today — no consumer aborts on truncated=true. But it lets careful agents detect data loss between polls.
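
The truncation handling pattern could look roughly like this on the caller side. The sketch is written under several assumptions: ReadResult, Reader, Drain, and fakeReader are hypothetical names, the returned data is assumed to end exactly at next_cursor (so the dropped byte count can be derived), and the loop stops on the first empty read for brevity, where a real agent would more likely sleep and retry until an exit code arrives.

```go
package main

import (
	"fmt"
	"strings"
)

// ReadResult mirrors the (data, next_cursor, truncated) triple.
type ReadResult struct {
	Data       string
	NextCursor int64
	Truncated  bool
}

// Reader is a hypothetical stand-in for the read operation.
type Reader func(offset int64) ReadResult

// Drain polls from offset until a read returns nothing new, reporting gaps
// through warn. It returns the collected output and the final cursor.
func Drain(read Reader, offset int64, warn func(string)) (string, int64) {
	var out strings.Builder
	for {
		res := read(offset)
		if res.NextCursor == offset {
			return out.String(), offset
		}
		if res.Truncated {
			// Assumption: Data ends at NextCursor, so its start is known.
			start := res.NextCursor - int64(len(res.Data))
			warn(fmt.Sprintf("dropped %d bytes", start-offset))
		}
		out.WriteString(res.Data)
		offset = res.NextCursor
	}
}

// fakeReader serves a fixed stream whose bytes before retainedFrom were dropped.
func fakeReader(stream string, retainedFrom int64) Reader {
	return func(offset int64) ReadResult {
		truncated := offset < retainedFrom
		if truncated {
			offset = retainedFrom
		}
		if offset > int64(len(stream)) {
			offset = int64(len(stream))
		}
		return ReadResult{Data: stream[offset:], NextCursor: int64(len(stream)), Truncated: truncated}
	}
}

func main() {
	read := fakeReader("0123456789", 4)
	out, cursor := Drain(read, 0, func(w string) { fmt.Println("warning:", w) })
	fmt.Println(out, cursor)
}
```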

Idle TTL and the reaper

The manager runs a reaper goroutine that closes sessions whose last activity is older than the idle TTL. The default is 30 minutes:

  • Read or send activity bumps a last_activity timestamp.
  • The reaper sweeps every minute or so and closes anything past TTL.
  • Closed sessions free their ring buffer immediately.
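
A reaper along these lines might be structured as a pure Sweep function driven by a ticker. This is a sketch under assumptions, not the sessionmgr implementation: the type and method names are hypothetical, the sweep interval is left as a parameter because the source only says "every minute or so", and a TTL of 0 is treated as "never reap".

```go
package main

import (
	"fmt"
	"sort"
	"sync"
	"time"
)

type session struct {
	lastActivity time.Time
	idleTTL      time.Duration // 0 means the reaper skips this session
}

type Manager struct {
	mu       sync.Mutex
	sessions map[string]*session
}

func NewManager() *Manager { return &Manager{sessions: make(map[string]*session)} }

func (m *Manager) Add(id string, ttl time.Duration, now time.Time) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.sessions[id] = &session{lastActivity: now, idleTTL: ttl}
}

// Touch records read or send activity on a session.
func (m *Manager) Touch(id string, now time.Time) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if s, ok := m.sessions[id]; ok {
		s.lastActivity = now
	}
}

// Sweep closes every session idle for longer than its TTL and returns their IDs.
func (m *Manager) Sweep(now time.Time) []string {
	m.mu.Lock()
	defer m.mu.Unlock()
	var closed []string
	for id, s := range m.sessions {
		if s.idleTTL > 0 && now.Sub(s.lastActivity) > s.idleTTL {
			delete(m.sessions, id) // frees the session (and its buffer) immediately
			closed = append(closed, id)
		}
	}
	sort.Strings(closed)
	return closed
}

func (m *Manager) Count() int {
	m.mu.Lock()
	defer m.mu.Unlock()
	return len(m.sessions)
}

// RunReaper calls Sweep on every tick until stop is closed.
func (m *Manager) RunReaper(interval time.Duration, stop <-chan struct{}) {
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case now := <-t.C:
			m.Sweep(now)
		case <-stop:
			return
		}
	}
}

// demo drives Sweep with fixed timestamps so the result is deterministic.
func demo() ([]string, int) {
	m := NewManager()
	t0 := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	m.Add("idle", 30*time.Minute, t0)
	m.Add("active", 30*time.Minute, t0)
	m.Add("pinned", 0, t0) // like idle_ttl_seconds: 0
	m.Touch("active", t0.Add(20*time.Minute))
	return m.Sweep(t0.Add(31 * time.Minute)), m.Count()
}

func main() {
	closed, remaining := demo()
	fmt.Println(closed, remaining) // [idle] 2
}
```

Separating Sweep from the ticker loop keeps the expiry rule testable with fixed timestamps instead of real sleeps.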

For tasks expected to take longer than 30 minutes (long builds, log tails, big migrations), pass idle_ttl_seconds: 0 on open to disable the reaper for that session:

{ "tool": "ssh_session", "operation": "open", "hostname": "prod-api-1", "idle_ttl_seconds": 0 }

Persistent sessions reserve memory (1 MiB ringbuf each by default). The cap is 64 per process; if you fan out heavily across board automations, watch the count and close sessions you no longer need.

Use idle_ttl_seconds: 0 for builds, deploys, or any task expected to take longer than 30 minutes. Otherwise the reaper will close the session out from under you.

Driver injection

sessionmgr does not talk to the relay directly; it accepts a Driver interface. Production wires:

  • connectclient.Remote for the dial.
  • SetMachine once per dial.
  • Execute (unary) and AttachStream (bidi) for the actual IO.

Tests inject a fake driver. This split is the reason the package can be unit-tested without standing up a real relay. See internal/connect/sessionmgr/driver.go and the test fixtures alongside it.
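
The shape of that split might look like the sketch below. The method names SetMachine and Execute come from the list above, but their signatures here are guesses, AttachStream is omitted for brevity, and fakeDriver, Session, Dial, and Run are hypothetical names rather than the contents of driver.go.

```go
package main

import (
	"context"
	"fmt"
)

// Driver is a hypothetical reduction of the interface sessionmgr accepts.
// Assumption: the real method set and signatures may differ.
type Driver interface {
	SetMachine(ctx context.Context, hostname string) error
	Execute(ctx context.Context, command string) (output string, exitCode int, err error)
}

// fakeDriver answers from a canned table instead of talking to a relay.
type fakeDriver struct {
	machine string
	replies map[string]string
	calls   []string
}

func (f *fakeDriver) SetMachine(_ context.Context, hostname string) error {
	f.machine = hostname
	return nil
}

func (f *fakeDriver) Execute(_ context.Context, command string) (string, int, error) {
	f.calls = append(f.calls, command)
	out, ok := f.replies[command]
	if !ok {
		return "", 127, nil // unknown command, shell-style exit code
	}
	return out, 0, nil
}

// Session depends only on the Driver interface, never on a concrete client.
type Session struct{ drv Driver }

// Dial selects the machine once, then hands back a session bound to the driver.
func Dial(ctx context.Context, drv Driver, hostname string) (*Session, error) {
	if err := drv.SetMachine(ctx, hostname); err != nil {
		return nil, fmt.Errorf("set machine %q: %w", hostname, err)
	}
	return &Session{drv: drv}, nil
}

func (s *Session) Run(ctx context.Context, command string) (string, int, error) {
	return s.drv.Execute(ctx, command)
}

// demo exercises the session against the fake, as a unit test would.
func demo() (output string, exitCode int, machine string) {
	fake := &fakeDriver{replies: map[string]string{"uname": "Linux\n"}}
	sess, err := Dial(context.Background(), fake, "vps-bmw")
	if err != nil {
		panic(err)
	}
	output, exitCode, _ = sess.Run(context.Background(), "uname")
	return output, exitCode, fake.machine
}

func main() {
	out, code, machine := demo()
	fmt.Printf("%q %d %s\n", out, code, machine)
}
```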

Reattach and identity

Session IDs come in two flavors:

  • Daemon-issued. Format host-uuid:slotN. These are what sessionmgr assigns when an agent calls open.
  • Desktop-minted. Format machine_<uuid>_<hex>. These are machine-scoped chat sessions used by the desktop inspector — see desktop-inspector-chat.

A session ID is valid for the lifetime of the daemon process that holds it. After daemon restart, sessions are not recovered — they were live in-memory state, not durable rows.

Multiple readers

Multiple agents (or multiple turns of one agent) can read the same session concurrently. Each maintains its own offset. The ring buffer serves all of them from the same underlying storage. Concurrent send is not supported on a single session — if a send arrives while the session is busy, it queues until the previous command finishes.
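
One way to get this behavior is a single worker goroutine that drains a command queue while readers hold their own cursors. The sketch below is an assumption about structure, not the real implementation: the Session type here is hypothetical, a plain byte slice stands in for the ring buffer, command execution is replaced by a placeholder string, and the queue is bounded at 16, so a 17th pending Send would block.

```go
package main

import (
	"fmt"
	"sync"
)

// Session serializes commands through one worker and lets any number of
// readers poll the shared output with independent cursors.
type Session struct {
	mu    sync.Mutex
	log   []byte // stands in for the ring buffer
	queue chan string
	done  chan struct{}
}

func NewSession() *Session {
	s := &Session{queue: make(chan string, 16), done: make(chan struct{})}
	go s.worker()
	return s
}

// worker runs queued commands one at a time, in arrival order.
func (s *Session) worker() {
	defer close(s.done)
	for cmd := range s.queue {
		out := "ran " + cmd + "\n" // placeholder for real remote execution
		s.mu.Lock()
		s.log = append(s.log, out...)
		s.mu.Unlock()
	}
}

// Send enqueues a command; it waits behind any command already in flight.
func (s *Session) Send(cmd string) { s.queue <- cmd }

// Close stops accepting commands and waits for queued ones to finish.
func (s *Session) Close() {
	close(s.queue)
	<-s.done
}

// Read returns output at or after offset plus the caller's next cursor.
func (s *Session) Read(offset int) (string, int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if offset > len(s.log) {
		offset = len(s.log)
	}
	return string(s.log[offset:]), len(s.log)
}

func main() {
	s := NewSession()
	s.Send("a")
	s.Send("b")
	s.Close()
	full, c1 := s.Read(0) // reader 1 starts from the beginning
	tail, c2 := s.Read(6) // reader 2 already consumed the first 6 bytes
	fmt.Printf("%q %d\n%q %d\n", full, c1, tail, c2)
}
```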

Failure modes

Symptom | Cause | Fix
------- | ----- | ---
session not found after a long pause. | Reaper closed it on idle TTL. | Re-open; pass idle_ttl_seconds: 0 next time.
truncated=true and you missed output. | Ring buffer wrapped between polls. | Poll more often, or accept the gap.
manager full (64 sessions) | Too many open sessions. | Close idle ones via ssh_session.close.
Long build hangs forever. | The remote process is genuinely stuck; sessionmgr does not enforce a per-command timeout. | Use the agent’s overall turn timeout, or close the session.

Related pages:

  • The simpler tool when you only need one command.
  • ask_agent and ask_agents for prompt-shaped work.
  • The session-as-object model that underpins this surface.
  • The auth gate on the underlying gRPC dial.
