
OS Automation

cmdop os is the GUI-level equivalent of execute_command. It covers keyboard, mouse, screenshot, clipboard, app lifecycle, scripts, and the OS accessibility tree.

These verbs run on the local machine. To drive a remote machine’s GUI, wrap them in cmdop connect exec <host> -- cmdop os ... so they execute on that host’s daemon.

Window management

cmdop os list             # all windows
cmdop os active           # active window info (title, app, geometry)
cmdop os focus "iTerm2"   # focus by title or app name

Output:

$ cmdop os active
{
  "app": "iTerm2",
  "title": "deploy@vps-audi: ~",
  "pid": 4567,
  "rect": [0, 23, 1920, 1080]
}

Keyboard & mouse

cmdop os type "hello world"      # type as if the user typed
cmdop os key cmd+space           # send key combo
cmdop os click 800 600           # click at coords
cmdop os click --right 800 600   # right click
cmdop os move 800 600            # move cursor
cmdop os scroll --y -300         # scroll

On macOS, Accessibility permissions must be granted to the cmdop binary or the desktop app (System Settings → Privacy & Security → Accessibility). cmdop doctor checks this and prints a hint if missing.
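Keystrokes go to whichever window currently has focus, so it can help to focus the target first. The following is a minimal sketch, not part of cmdop: `type_into` is a hypothetical helper name, the one-second pause is an assumed safety margin, and only the `cmdop os focus` and `cmdop os type` verbs listed above are used.

```shell
# Hypothetical helper: focus an app, then type text into it.
# Uses only `cmdop os focus` and `cmdop os type` from the listing above.
type_into() {
  app=$1
  text=$2
  # Stop if the window cannot be focused, so keystrokes never land elsewhere.
  cmdop os focus "$app" || return 1
  # Assumed pause so the window manager can finish the focus change.
  sleep 1
  cmdop os type "$text"
}
```

A call such as `type_into "iTerm2" "hello world"` would then focus iTerm2 before typing.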

Screenshot & clipboard

cmdop os screenshot                                   # full screen
cmdop os screenshot --window "iTerm2"
cmdop os screenshot --rect 0,0,800,600
cmdop os screenshot --out /tmp/shot.png
cmdop os clipboard                                    # read
cmdop os clipboard --set "new value"
cmdop os clipboard --type image --out /tmp/clip.png

App lifecycle

cmdop os apps                         # list running apps
cmdop os launch "Visual Studio Code"
cmdop os quit "Slack"

Scripts (AppleScript / PowerShell / shell)

cmdop os script --apple 'tell application "Finder" to get name of windows'
cmdop os script --pwsh 'Get-Process | Select-Object -First 5'
cmdop os script --shell 'launchctl list | head'

The --apple backend runs only on macOS, --pwsh only on Windows; --shell is cross-platform.
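A wrapper script that must run on several platforms could pick the backend flag from the kernel name. This is a sketch under assumptions: `script_backend` is a hypothetical helper, and it assumes Windows shells such as Git Bash, MSYS2, or Cygwin report a `uname -s` value starting with MINGW, MSYS, or CYGWIN.

```shell
# Hypothetical helper: map a kernel name (as printed by `uname -s`)
# to the matching `cmdop os script` backend flag.
script_backend() {
  case "$1" in
    Darwin)               echo "--apple" ;;  # AppleScript: macOS only
    MINGW*|MSYS*|CYGWIN*) echo "--pwsh" ;;   # PowerShell: Windows only
    *)                    echo "--shell" ;;  # cross-platform fallback
  esac
}

# Example use (commented out so the sketch has no side effects):
# backend=$(script_backend "$(uname -s)")
```

The script body itself still has to be written in the language of the selected backend; the helper only chooses the flag.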

UI accessibility tree

cmdop os ui                  # active window's tree
cmdop os ui --window "Mail"
cmdop os ui --click "Send"   # find + click button by label

The accessibility tree powers the desktop’s UI inspector and skills like “click Send button”.

Running OS verbs on a remote machine

Wrap the command in cmdop connect exec:

cmdop connect exec mac-studio -- cmdop os screenshot --out /tmp/mac.png
cmdop connect exec mac-studio -- cmdop os clipboard
cmdop connect exec mac-studio -- cmdop os launch "Visual Studio Code"

The OS automation runs on mac-studio, so any files it writes land on mac-studio too. Pull them back with cmdop files get mac-studio:/tmp/mac.png ./mac.png.
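The capture and the copy can be chained into a single step. The sketch below assumes nothing beyond the two commands shown in this section. `remote_screenshot` is a hypothetical helper name, not a cmdop verb.

```shell
# Hypothetical helper: take a screenshot on a remote host, then copy it locally.
# Arguments: <host> <remote_path> <local_path>
remote_screenshot() {
  host=$1
  remote_path=$2
  local_path=$3
  # Capture on the remote host; the file is written on that host.
  cmdop connect exec "$host" -- cmdop os screenshot --out "$remote_path" || return 1
  # Copy the file back only if the capture succeeded.
  cmdop files get "$host:$remote_path" "$local_path"
}
```

For example, `remote_screenshot mac-studio /tmp/mac.png ./mac.png` would run the same two commands as above in order.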

On a headless server, cmdop os will mostly fail because there is no display and there are no GUI apps. Use it on macOS desktops, Linux desktops with a display, and Windows.
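A script can check for a display before calling any cmdop os verb. The following is a rough heuristic, not something cmdop provides: `has_display` is a hypothetical helper that assumes a Linux desktop session sets DISPLAY (X11) or WAYLAND_DISPLAY (Wayland), and it treats every other platform as having a GUI.

```shell
# Hypothetical helper: succeed if a GUI session is probably available.
# Argument: kernel name as printed by `uname -s`.
has_display() {
  case "$1" in
    # Linux: look for the X11 or Wayland session variables.
    Linux) [ -n "${DISPLAY:-}" ] || [ -n "${WAYLAND_DISPLAY:-}" ] ;;
    # Other platforms: assume a desktop session exists.
    *) return 0 ;;
  esac
}

# Example use (commented out so the sketch has no side effects):
# has_display "$(uname -s)" || { echo "no display, skipping cmdop os" >&2; exit 1; }
```

The check can give false positives, for example over SSH with X forwarding, so treat it as an early warning and still handle failures from cmdop os itself.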

Combining with chat

The agent inside cmdop chat can call most of these as tools when the persona allows. To debug what a chat session has access to:

cmdop chat --debug-prompt 2>&1 | grep -A 5 'os_'

Permissions on remote OS calls

When vps-bmw asks mac-studio to run cmdop os screenshot, the call is gated by mac-studio’s permission ruleset. Lock down with patterns like:

cmdop permissions deny 'execute_command(cmdop os clipboard *)'
cmdop permissions ask 'execute_command(cmdop os script *)'
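To deny several verbs at once, the pattern can be generated in a loop. This sketch assumes that every cmdop os verb matches the same `execute_command(cmdop os <verb> *)` shape as the two examples above, which this page does not confirm. `os_rule` and `deny_os_verbs` are hypothetical helper names.

```shell
# Hypothetical helper: build a permission pattern for one `cmdop os` verb.
# Assumes all verbs follow the shape of the documented examples.
os_rule() {
  printf 'execute_command(cmdop os %s *)' "$1"
}

# Hypothetical helper: deny each verb given as an argument.
deny_os_verbs() {
  for verb in "$@"; do
    # Stop at the first rule that cannot be applied.
    cmdop permissions deny "$(os_rule "$verb")" || return 1
  done
}
```

Running `deny_os_verbs clipboard script` would issue one `cmdop permissions deny` call per verb.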

See ./permissions.
