Agent Sandbox
The sandbox is the execution boundary for session-attached resources and tool work. Use this page when you want to answer:
- what the execution hand can actually do
- how mount boundaries affect file and shell work
- why the shell surface is bounded the way it is
- what the sandbox owns versus what tools own
Why this abstraction matters
The sandbox should be replaceable. Today it is local-only. Later it may be:
- a remote container
- a cloud worker
- another isolated executor
As long as the rest of the runtime depends only on provision and execute, that swap stays tractable.
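One way to picture the provision and execute boundary is as a small interface that callers depend on. The following is a minimal sketch, not the project's actual types: the ResourceAttachment fields, SandboxAction, SandboxResult, InMemorySandbox, and runOnce are all hypothetical names introduced for illustration.

```typescript
// Hypothetical shapes: the source names only `provision` and `execute`.
interface ResourceAttachment {
  mountPath: string; // e.g. "/workspace"
  access: "read-only" | "read-write";
}

interface SandboxAction {
  name: string; // e.g. "list_dir"
  args: Record<string, unknown>;
}

interface SandboxResult {
  ok: boolean;
  output: string;
}

// The only surface the rest of the runtime would depend on.
interface Sandbox {
  provision(resources: ResourceAttachment[]): Promise<void>;
  execute(action: SandboxAction): Promise<SandboxResult>;
}

// A stand-in executor that records mounts in memory instead of touching disk.
class InMemorySandbox implements Sandbox {
  private mounts: string[] = [];

  async provision(resources: ResourceAttachment[]): Promise<void> {
    this.mounts = resources.map((r) => r.mountPath);
  }

  async execute(action: SandboxAction): Promise<SandboxResult> {
    if (this.mounts.length === 0) {
      return { ok: false, output: "no mounts provisioned" };
    }
    return { ok: true, output: `${action.name} ran with mounts: ${this.mounts.join(", ")}` };
  }
}

// Caller code is written against the interface, so another executor could be swapped in.
async function runOnce(sandbox: Sandbox): Promise<SandboxResult> {
  await sandbox.provision([{ mountPath: "/workspace", access: "read-write" }]);
  return sandbox.execute({ name: "list_dir", args: { path: "/workspace" } });
}
```

Because runOnce only sees the Sandbox interface, replacing InMemorySandbox with a remote or container-backed class would not change the caller.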
Current implementation
The current implementation is:
- local
- workspace-backed
- resource-aware
- path-scoped by provisioned mounts
Supported actions:
- list_dir
- read_text
- write_text
- append_text
- replace_text
- mkdir
- stat
- find_entries
- glob_entries
- grep_text
- run_command
- run_shell
- inspect_persistent_shell
- open_persistent_shell
- exec_persistent_shell
- close_persistent_shell
- inspect
Mount points:
- /workspace
- /workspace/agent
- /runtime
- /memory/learnings
- /vaults/<name>
.openboa-runtime/ guides inside /workspace:
- session-runtime.md
- session-runtime.json
- managed-tools.json
- permissions.json
- environment.json
- agent-setup.json
- agent-setup.md
- skills.json
- vaults.json
- session-status.json
- outcome.json
- outcome-grade.json
- outcome-grade.md
- outcome-evaluation.json
- outcome-evaluation.md
- outcome-evaluations.json
- outcome-evaluations.md
- outcome-repair.md
- event-feed.json
- event-feed.md
- wake-traces.json
- shell-state.json
- shell-history.json
- shell-history.md
- shell-last-output.json
- shell-last-output.md
run_command is a bounded non-shell command hand.
It runs with shell: false, a working directory inside a writable mounted root such as /workspace, and a timeout.
It also receives only a minimal explicit environment instead of inheriting the full host process environment.
When vault mounts are involved, run_command is further restricted to structure-only commands such as pwd and ls so the runtime does not dump secret file contents through ordinary command output.
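The run_command constraints above (no shell, a working directory inside a writable mount, a timeout, and a minimal explicit environment) can be sketched as a function that builds the spawn options. This is an illustration under assumptions: buildCommandOptions, isInside, EXPOSED_ENV_KEYS, and the default timeout are hypothetical, and the option names simply mirror the cwd, env, shell, and timeout options that Node's child_process.spawn accepts.

```typescript
// Hypothetical option shape; field names mirror Node's child_process.spawn options.
interface CommandOptions {
  cwd: string;
  env: Record<string, string>;
  shell: false;
  timeout: number; // milliseconds
}

// Assumed allowlist of environment keys that are safe to expose.
const EXPOSED_ENV_KEYS = ["PATH", "HOME", "LANG"];

// True when `path` is `root` itself or nested under it.
function isInside(root: string, path: string): boolean {
  return path === root || path.startsWith(root.endsWith("/") ? root : root + "/");
}

function buildCommandOptions(
  cwd: string,
  writableRoots: string[],
  hostEnv: Record<string, string | undefined>,
  timeoutMs: number = 10000,
): CommandOptions {
  // Reject traversal segments before doing the prefix check.
  if (cwd.split("/").includes("..")) {
    throw new Error(`cwd must not contain ".." segments: ${cwd}`);
  }
  if (!writableRoots.some((root) => isInside(root, cwd))) {
    throw new Error(`cwd is outside every writable mount: ${cwd}`);
  }
  // Copy only the allowlisted keys instead of inheriting hostEnv wholesale.
  const env: Record<string, string> = {};
  for (const key of EXPOSED_ENV_KEYS) {
    const value = hostEnv[key];
    if (value !== undefined) env[key] = value;
  }
  return { cwd, env, shell: false, timeout: timeoutMs };
}
```

Building the environment from an allowlist, rather than deleting known-sensitive keys from the host environment, means a newly added host secret is excluded by default.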
run_shell is the writable one-shot shell hand.
It is restricted to writable execution mounts such as /workspace, runs under an advisory write lease, and is meant to sit behind a higher-level confirmation policy.
The runtime also now exposes a session-scoped persistent shell process through the sandbox actions:
- inspect_persistent_shell
- open_persistent_shell
- exec_persistent_shell
- close_persistent_shell
and through the shell tools:
- shell_open
- shell_restart
- shell_exec
- shell_close
The command surfaces are therefore:
- bounded read-only command inspection via run_command
- permission-gated one-shot writable shell composition via run_shell
- permission-gated multi-step shell continuity via the persistent shell actions
commandId values, so future commands can continue from the last chosen cwd without losing immediate shell continuity.
The persistent shell metadata is also mirrored into .openboa-runtime/shell-state.json, .openboa-runtime/shell-history.md, and .openboa-runtime/shell-last-output.md. For the current session, shell_describe prefers live process introspection over these mirrors, so stale runtime memory does not misreport an already-closed shell.
That live metadata now includes whether the shell is busy and which currentCommand is still running, so the runtime can treat a long-lived shell command as an active hand instead of guessing from stale summaries.
When the live shell is busy, shell_describe also returns a busyPlan that points the model at shell_wait first, exposes shell_read_last_output as the bounded evidence step, and enumerates the safe read-first tools that can still run while the shell mutation hand is occupied.
That same live metadata now includes partial stdout/stderr preview for the running command, so read-first inspection can use current evidence instead of only durable last-command state.
If the live process is gone, shell_describe now returns a recoveryPlan that tells the model whether it should call shell_restart or shell_open next.
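The busyPlan and recoveryPlan behavior described above suggests a simple decision order for a caller: recover first, wait second, execute last. The sketch below assumes a result shape for shell_describe; the ShellDescription fields (live, waitTool, evidenceTool, safeTools, nextTool) and the nextShellTool helper are hypothetical, and only the tool names come from this page.

```typescript
// Hypothetical result shape, loosely following the fields named above.
interface ShellDescription {
  live: boolean; // is the persistent shell process still running?
  busy: boolean;
  currentCommand?: string;
  busyPlan?: { waitTool: string; evidenceTool: string; safeTools: string[] };
  recoveryPlan?: { nextTool: "shell_restart" | "shell_open" };
}

function nextShellTool(desc: ShellDescription): string {
  if (!desc.live) {
    // Follow the recovery plan when present; falling back to shell_open is an assumption.
    return desc.recoveryPlan?.nextTool ?? "shell_open";
  }
  if (desc.busy) {
    // A busy shell should be waited on before any new mutation is attempted.
    return desc.busyPlan?.waitTool ?? "shell_wait";
  }
  return "shell_exec";
}
```

Checking liveness before busyness matters here: a shell that has exited cannot be waited on, so the recovery branch has to win.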
glob_entries is the low-level path-matching primitive behind the higher-level glob managed tool.
grep_text also supports optional regex mode for the higher-level grep managed tool.
sandbox_describe should be treated as the source of truth for the hand’s current contract.
It returns:
- mounted resources
- explicit constraints
- action-level access hints
- command policy details such as the allowlisted read-only commands and exposed environment keys
The allowlisted read-only commands cover:
- file and directory listing or reads (ls, cat, head, tail, wc)
- workspace/path introspection (pwd, basename, dirname, realpath)
- environment inspection (env)
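The read-only command allowlist and the stricter rule for vault mounts could be combined into a single check. A minimal sketch follows, assuming a hypothetical isCommandAllowed helper; the command names are the ones listed on this page, and because the page gives pwd and ls only as examples of structure-only commands, the real vault-safe set may be larger.

```typescript
// Command names taken from this page; the function shape is hypothetical.
const READ_ONLY_COMMANDS = new Set<string>([
  "ls", "cat", "head", "tail", "wc",
  "pwd", "basename", "dirname", "realpath",
  "env",
]);

// Structure-only commands permitted when vault mounts are involved.
// The source says "such as pwd and ls", so this set is an assumption.
const VAULT_SAFE_COMMANDS = new Set<string>(["pwd", "ls"]);

function isCommandAllowed(command: string, touchesVault: boolean): boolean {
  // Anything outside the allowlist is rejected regardless of mount.
  if (!READ_ONLY_COMMANDS.has(command)) return false;
  // Under vault mounts, content-reading commands such as cat are also rejected.
  return touchesVault ? VAULT_SAFE_COMMANDS.has(command) : true;
}
```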
/workspace/agent is intentionally read-only through normal sandbox actions.
If a durable substrate file should be edited, stage it into /workspace with resources_stage_from_substrate, compare it with resources_compare_with_substrate, then promote the chosen result back with resources_promote_to_substrate.
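The stage, compare, and promote sequence can be sketched as one helper that never writes to /workspace/agent directly. Only the three tool names come from this page; the callTool dispatcher, the path argument, and the stagedPath and changed result fields are assumptions made for illustration.

```typescript
// `callTool` is a hypothetical dispatcher; argument names and result shapes are assumptions.
type ToolCall = (
  name: string,
  args: Record<string, unknown>,
) => Promise<Record<string, unknown>>;

async function editSubstrateFile(
  callTool: ToolCall,
  substratePath: string,
  edit: (stagedPath: string) => Promise<void>,
): Promise<boolean> {
  // 1. Stage a writable copy into /workspace.
  const staged = await callTool("resources_stage_from_substrate", { path: substratePath });
  const stagedPath = String(staged["stagedPath"]);

  // 2. Apply the edit to the staged copy only.
  await edit(stagedPath);

  // 3. Compare against the substrate; skip promotion when nothing changed.
  const diff = await callTool("resources_compare_with_substrate", { path: stagedPath });
  if (diff["changed"] !== true) return false;

  // 4. Promote the chosen result back to the substrate.
  await callTool("resources_promote_to_substrate", { path: stagedPath });
  return true;
}
```

Keeping the compare step between edit and promote gives a reviewer or a confirmation policy a natural place to inspect the change before it becomes durable.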
The shared learnings mount at /memory/learnings is also read-only through the sandbox hand.
Cross-session learnings should be promoted through the runtime learning capture path, not rewritten ad hoc through filesystem actions.
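The read-only rules for /workspace/agent and /memory/learnings imply that a nested mount must override its parent. One way to sketch that is a longest-prefix match over a mount table. The MOUNTS table, resolveMount, and canWrite below are hypothetical; the mount paths and their access modes are the ones stated on this page, and paths are assumed to be already normalized.

```typescript
type Access = "read-only" | "read-write";

interface Mount {
  path: string;
  access: Access;
}

// Paths and modes as described on this page; the table itself is illustrative.
const MOUNTS: Mount[] = [
  { path: "/workspace", access: "read-write" },
  { path: "/workspace/agent", access: "read-only" },
  { path: "/memory/learnings", access: "read-only" },
];

// Longest-prefix match, so /workspace/agent wins over /workspace.
// Assumes `target` is absolute and already normalized (no ".." segments).
function resolveMount(target: string): Mount | undefined {
  let best: Mount | undefined;
  for (const mount of MOUNTS) {
    const inside = target === mount.path || target.startsWith(mount.path + "/");
    if (inside && (best === undefined || mount.path.length > best.path.length)) {
      best = mount;
    }
  }
  return best;
}

function canWrite(target: string): boolean {
  const mount = resolveMount(target);
  // Paths outside every mount are never writable.
  return mount !== undefined && mount.access === "read-write";
}
```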
That is enough for the current session-first runtime frontier.
The important mental model is:
- tools describe intent and policy
- the sandbox performs bounded execution inside the current environment and mounts
Relationship to resources
The sandbox does not invent resources. It receives ResourceAttachment[] and provisions them.
That keeps the boundary clear:
- session owns the list of attached resources
- environment defines the reusable execution substrate
- sandbox turns those into an executable runtime context
Relationship to tools
The sandbox is not the tool registry. Tools define:
- name
- description
- schema
- ownership
- permission policy
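The five fields above could be captured in a small type, which makes it visible that a tool definition carries intent and policy but no execution logic. This is a sketch under assumptions: the field types, the two permission policy values, and the example values for the grep tool are hypothetical.

```typescript
// Field names follow the list above; the concrete types are assumptions.
interface ToolDefinition {
  name: string;
  description: string;
  schema: Record<string, unknown>; // e.g. a JSON-Schema-like description of arguments
  ownership: string; // which layer or module owns the tool
  permissionPolicy: "always-allow" | "requires-confirmation";
}

// Illustrative entry for the grep managed tool mentioned on this page.
const grepTool: ToolDefinition = {
  name: "grep",
  description: "Search file contents under a mounted path.",
  schema: {
    type: "object",
    properties: { pattern: { type: "string" }, regex: { type: "boolean" } },
    required: ["pattern"],
  },
  ownership: "runtime",
  permissionPolicy: "always-allow",
};

// Policy is decided from the definition alone, before the sandbox is involved.
function needsConfirmation(tool: ToolDefinition): boolean {
  return tool.permissionPolicy === "requires-confirmation";
}
```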
Current non-goals
The current sandbox layer does not yet try to be:
- a PTY-backed terminal emulator
- a cloud container runtime
- a full network policy engine
- a secret distribution system
- a multi-tenant production isolation layer