Code execution & sandbox

When the assistant runs a shell command, a Python cell, or a build, it does so inside a managed sandbox — not on the host serving the API. This page documents where that execution happens (backends), how isolated it is, and the exact tools the agent uses, with their parameters and return shapes.

Backend selection is platform-managed

Some agents let the caller pick the execution backend. Here the platform chooses it per deployment, and an API request cannot override that choice. You write the same tool calls regardless of backend; the isolation tier is an operational guarantee, not a parameter.

Execution backends

The active backend is configured via the SANDBOX_BACKEND setting (a comma-separated fallback list; the first entry is primary). Every backend exposes the same tool surface — only the isolation and persistence characteristics differ.

Backend     | What it is                      | Isolation                                                    | Use case
------------|---------------------------------|--------------------------------------------------------------|------------------------------------
e2b         | Managed cloud sandbox (default) | Full VM isolation, per-session filesystem                    | Default for hosted runs
firecracker | Firecracker microVM             | Hardware-virtualized microVM, UFFD memory, pooled networking | Strongest isolation, fast cold-start
superserve  | Managed agents-SDK runtime      | Provider-managed container                                   | Managed scaling
blaxel      | Managed agents-SDK runtime      | Provider-managed container                                   | Managed scaling
local       | This host                       | Workspace bind-jail + git staging jail                       | Trusted self-host / dev only
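
The fallback-list format of SANDBOX_BACKEND can be illustrated with a short parser. This is a minimal sketch, not the platform's own code: the function name parse_backends is hypothetical, and the set of valid names is simply copied from the backend table on this page.

```python
# Backend names as listed in the backend table on this page.
KNOWN_BACKENDS = {"e2b", "firecracker", "superserve", "blaxel", "local"}


def parse_backends(raw: str) -> list[str]:
    """Split a comma-separated fallback list; the first entry is primary.

    Hypothetical helper for illustration only.
    """
    names = [part.strip() for part in raw.split(",") if part.strip()]
    if not names:
        raise ValueError("SANDBOX_BACKEND is empty")
    unknown = [name for name in names if name not in KNOWN_BACKENDS]
    if unknown:
        raise ValueError(f"unknown backend(s): {unknown}")
    return names


# Example value: firecracker is primary, e2b is the fallback.
backends = parse_backends("firecracker, e2b")
primary, fallbacks = backends[0], backends[1:]
print(primary, fallbacks)
```

Rejecting unknown names up front is a design choice of this sketch; how the platform treats a misspelled entry is not stated here.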

Isolation & containment

Containerized and microVM backends run under a containment profile, not a bare shell:

  • Filesystem jail — file access is confined to the session workspace; path traversal outside the root is rejected before any I/O.
  • Egress proxy — outbound network goes through a controlled egress path rather than raw host networking.
  • Per-session filesystem — each session gets its own workspace; installed packages and cwd changes persist across tool calls within a session.
  • microVM tier — on firecracker, the workload runs in a hardware-virtualized microVM with its own kernel, isolated from the host and from other sessions.
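
The filesystem-jail rule above (reject any path that escapes the workspace root before doing I/O) can be sketched as follows. The function name resolve_in_workspace and the /workspace root are assumptions made for illustration; the platform's actual check is not documented on this page.

```python
from pathlib import Path


def resolve_in_workspace(root: str, requested: str) -> Path:
    """Resolve `requested` against `root` and refuse anything outside it.

    Hypothetical helper: the check runs before any file is opened.
    """
    root_path = Path(root).resolve()
    # Joining with an absolute path replaces the root, and ".." segments are
    # collapsed by resolve(), so both escape routes surface in the check below.
    candidate = (root_path / requested).resolve()
    if not candidate.is_relative_to(root_path):  # Python 3.9+
        raise PermissionError(f"path escapes workspace: {requested}")
    return candidate


inside = resolve_in_workspace("/workspace", "src/main.py")

try:
    resolve_in_workspace("/workspace", "../etc/passwd")
    escaped = True
except PermissionError:
    escaped = False
```

Resolving before comparing matters: a plain string-prefix test on the unresolved path would accept /workspace/../etc/passwd.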

local backend is not a sandbox

SANDBOX_BACKEND=local runs on the host with only a workspace bind-jail and git staging jail. Use it for trusted self-hosted or development setups — never for untrusted input.

Tools

sandbox_bash — run shell / Python

One-shot command execution in the sandbox. Returns captured streams plus any files written to the workspace.

Field              | Type         | Description
-------------------|--------------|------------------------------------------------------------------------------------
command (required) | string       | Shell command to execute.
description        | string       | Short human-readable label of what the command does, e.g. 'Installing dependencies'.
timeout_seconds    | int (1–3600) | Command timeout. Default 60s; raise for long installs, builds, or data exports.
cwd                | string       | Working directory (e.g. /workspace). Default: sandbox home.

Returns:

Field              | Meaning
-------------------|-------------------------------------------------
stdout / stderr    | Captured output streams.
exit_code          | Process exit code.
truncated          | true if output was clipped to the size limit.
files / file_count | Files produced in the workspace by this command.
local_paths        | Map of workspace files to retrievable paths.
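
As an illustration of the parameters and return fields above, the sketch below builds a sandbox_bash argument object and summarizes a result. The helper names are hypothetical, the sample result is made-up data, and the way a tool call is actually transported to the sandbox is outside the scope of this example.

```python
from typing import Optional


def build_sandbox_bash_args(
    command: str,
    description: Optional[str] = None,
    timeout_seconds: int = 60,
    cwd: Optional[str] = None,
) -> dict:
    """Assemble arguments for sandbox_bash, enforcing the documented limits."""
    if not command:
        raise ValueError("command is required")
    if not 1 <= timeout_seconds <= 3600:
        raise ValueError("timeout_seconds must be in 1..3600")
    args = {"command": command, "timeout_seconds": timeout_seconds}
    if description is not None:
        args["description"] = description
    if cwd is not None:
        args["cwd"] = cwd
    return args


def summarize_result(result: dict) -> str:
    """Turn a sandbox_bash result into a one-line status string."""
    code = result["exit_code"]
    status = "ok" if code == 0 else f"failed ({code})"
    note = " [output truncated]" if result.get("truncated") else ""
    return f"{status}, {result.get('file_count', 0)} file(s){note}"


args = build_sandbox_bash_args(
    "pip install requests",
    description="Installing dependencies",
    timeout_seconds=300,  # raised from the 60s default for a long install
    cwd="/workspace",
)

# Made-up result, shaped like the return table above.
sample = {"stdout": "", "stderr": "", "exit_code": 0, "truncated": True, "file_count": 2}
print(summarize_result(sample))
```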

repl_execute — persistent Python REPL

A Jupyter-like Python session that keeps state across calls, with built-in helpers (peek, grep, chunk_indices, llm_query, FINAL) for working over large data without re-loading it each turn.

exec_command — interactive terminal (PTY)

Runs a command in a PTY session. Commands that finish return stdout + exit_code directly; long-lived processes (servers, REPLs, watchers) stay alive and return a session_id you keep feeding via write_stdin.

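Field              | Type         | Description
-------------------|--------------|-----------------------------------------------------------------------------------
command (required) | string       | Command to run in a PTY.
rows               | int (1–1000) | PTY rows. Default 24.
cols               | int (1–1000) | PTY columns. Default 80.
grace_seconds      | float        | How long to wait for a quick command to finish before returning a live session_id.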

Returns session_id, pid, stdout, cursor, status, exit_code, alive — the cursor lets the next read pick up exactly where the last one left off.

write_stdin — drive a live session

Sends characters to the stdin of a still-running exec_command session and reads fresh output. Append \n to submit a line.

Field                 | Type   | Description
----------------------|--------|----------------------------------------------------------------------
session_id (required) | string | Session id from a prior exec_command that is still alive.
input (required)      | string | Characters to write to stdin. Append \n to submit a line.
since                 | int    | Cursor from the previous read; returns only output produced after it.
wait_seconds          | float  | How long to wait for new output before returning.
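
The exec_command / write_stdin handshake can be sketched as a small driver loop. Several things here are assumptions: the return shape of write_stdin is not documented on this page, so the sketch assumes it includes stdout and cursor like exec_command does; the cursor is modeled as a character offset; and fake_write_stdin is a stand-in so the example runs without a sandbox.

```python
from typing import Callable


def send_lines(first: dict, write_stdin: Callable[..., dict], lines: list[str]) -> str:
    """Feed lines to a live session, collecting only fresh output each time.

    `first` is the result of exec_command; `write_stdin` is injected so the
    loop can be exercised against a fake.
    """
    if not first["alive"]:
        return first["stdout"]  # the command already finished; nothing to drive
    collected, cursor = first["stdout"], first["cursor"]
    for line in lines:
        reply = write_stdin(
            session_id=first["session_id"],
            input=line + "\n",  # the trailing newline submits the line
            since=cursor,       # ask only for output produced after this point
            wait_seconds=1.0,
        )
        collected += reply["stdout"]
        cursor = reply["cursor"]
    return collected


def fake_write_stdin(session_id: str, input: str, since: int, wait_seconds: float) -> dict:
    """Stand-in for the real tool: echoes the input and advances the cursor."""
    out = f"echo: {input}"
    return {"stdout": out, "cursor": since + len(out), "alive": True}


# Made-up exec_command result for a process that is still running.
first = {"session_id": "s-1", "alive": True, "stdout": "ready\n", "cursor": 6}
transcript = send_lines(first, fake_write_stdin, ["a", "b"])
```

Passing the previous cursor as since on every call is what keeps the transcript free of repeated output.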

edit_file — surgical file edits

Search-and-replace edits in sandbox files, with a fuzzy-matching fallback when the target text has drifted slightly.
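
How the fuzzy fallback works internally is not specified here. One plausible approach, shown purely as a sketch, is to try an exact match first and otherwise pick the best-scoring window of lines using difflib from the Python standard library. The function name and the 0.85 threshold are assumptions.

```python
import difflib


def replace_with_fuzzy_fallback(text: str, search: str, replace: str,
                                threshold: float = 0.85) -> str:
    """Replace `search` in `text`, tolerating small drift in the target."""
    if not search:
        raise ValueError("search text must not be empty")
    if search in text:
        return text.replace(search, replace, 1)  # exact match: edit once

    # Fallback: score every window with the same number of lines as `search`.
    lines = text.splitlines(keepends=True)
    n = len(search.splitlines())
    best_ratio, best_start = 0.0, -1
    for start in range(len(lines) - n + 1):
        window = "".join(lines[start:start + n])
        ratio = difflib.SequenceMatcher(None, window.strip(), search.strip()).ratio()
        if ratio > best_ratio:
            best_ratio, best_start = ratio, start
    if best_start < 0 or best_ratio < threshold:
        raise ValueError("no sufficiently close match for the search text")

    # Keep the line ending of the replaced window if the replacement lacks one.
    last = lines[best_start + n - 1]
    tail = "\n" if last.endswith("\n") and not replace.endswith("\n") else ""
    return "".join(lines[:best_start]) + replace + tail + "".join(lines[best_start + n:])


source = "def f(x):\n    return x + 1\n"
# The search text has drifted: spaces around "+" are missing.
edited = replace_with_fuzzy_fallback(source, "    return x+1", "    return x + 2")
```

Raising when nothing clears the threshold, rather than editing the closest line anyway, keeps a drifted search from silently changing the wrong place.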

bsl_check — 1C (BSL) static analysis

Runs bsl-language-server over 1C code. Point it at a .bsl / .os file or a configuration-export folder; returns diagnostics.

Full tool list

This page covers the code-execution tools. For the complete catalog across all categories (web, memory, connectors, media, …) see Agents → Available tools.
