# Execution modes

How workflows actually run — three container modes (`per_execution`, `per_workflow`, `shared`), workflow-type defaults, and the configuration that controls them.
Every workflow executes inside a container. Three runtime modes trade off cold-start latency against isolation and resource use. The mode is set per workflow (with sensible defaults per workflow type), and a single broker — `RuntimeBroker` in `internal/workflows/runtime_broker.go` — picks the right provider at acquire time.
There is no “host mode” — workflows never run directly on the API process. If no container runtime is available, `Acquire` returns an error.
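To make the dispatch concrete, here is a minimal sketch of a broker that selects a provider by mode at acquire time. The `provider` interface, the struct fields, and the error text are illustrative assumptions; only `RuntimeBroker` and `Acquire` are names from the source.

```go
package workflows

import (
	"context"
	"errors"
)

// Illustrative shapes only; the real RuntimeBroker in
// internal/workflows/runtime_broker.go differs in detail.
type RuntimeConfig struct {
	Mode string // "per_execution" | "per_workflow" | "shared"
}

type Handle struct{ ContainerID string }

type provider interface {
	acquire(ctx context.Context, execID string) (*Handle, error)
}

type RuntimeBroker struct {
	perExec     provider
	perWorkflow func(workflowID string) provider
	shared      provider
}

// Acquire picks a provider by mode. There is no host fallback: without a
// container runtime behind these providers, acquisition simply errors.
func (b *RuntimeBroker) Acquire(ctx context.Context, workflowID, execID string, cfg RuntimeConfig) (*Handle, error) {
	switch cfg.Mode {
	case "per_workflow":
		return b.perWorkflow(workflowID).acquire(ctx, execID)
	case "shared":
		return b.shared.acquire(ctx, execID)
	case "", "per_execution": // type defaults fill Mode before this point
		return b.perExec.acquire(ctx, execID)
	default:
		return nil, errors.New("unknown runtime mode: " + cfg.Mode)
	}
}
```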
## The three modes
| Mode | Container lifetime | Cold start | Isolation | Use it for |
|---|---|---|---|---|
| `per_execution` | One container per run, torn down on `Release` | High (full create + start) | Highest — fresh filesystem every time | Agents, CI/CD, untrusted code, anything that mutates `/workspace` |
| `per_workflow` | Long-lived containers per workflow ID, with N replicas | One-time on first run | Medium — `/workspace/<execID>` per run inside a shared container | Frequently triggered workflows, pre-installed toolchains |
| `shared` | One pool of long-lived containers shared across all workflows | One-time at boot | Lower — same container serves many workflows | Chat handlers, lightweight orchestration |
In all three modes, each execution gets its own logical workspace; the difference is whether the container itself persists.
## `per_execution` — fresh container per run
The default for `agent`, `ci`, and `automation` workflow types. On `Acquire`:
- Create a container named `wf-<execID>` from the configured image.
- Mount a named volume `wf-<execID>-ws` at `/workspace` so sidecars (nodes that need a different image) can share files.
- Optionally mount the Docker socket (`docker_socket: true`) and publish container ports to the host (`publish_ports: [22, …]`) for SSH/preview workloads.
- Run an idle command (`sh -c "mkdir -p /workspace && sleep infinity"`) so the runtime can `docker exec` into it.
On `Release`, the container is stopped and removed. The named volume goes away with it.
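A hedged sketch of those steps using the docker CLI (the real broker may call the runtime API directly; `acquirePerExecution` is a hypothetical name, and the optional socket mount and port publishing are omitted):

```go
package workflows

import (
	"fmt"
	"os/exec"
)

// Sketch of the documented per_execution steps via the docker CLI.
func acquirePerExecution(execID, image string) error {
	name := "wf-" + execID
	steps := [][]string{
		// Named volume so sidecars can share /workspace.
		{"docker", "volume", "create", name + "-ws"},
		// Idle command keeps the container alive for later docker exec.
		{"docker", "run", "-d", "--name", name,
			"-v", name + "-ws:/workspace",
			image, "sh", "-c", "mkdir -p /workspace && sleep infinity"},
	}
	for _, s := range steps {
		if out, err := exec.Command(s[0], s[1:]...).CombinedOutput(); err != nil {
			return fmt.Errorf("%v failed: %w: %s", s, err, out)
		}
	}
	return nil
}

// releasePerExecution mirrors Release: stop and remove the container; the
// named volume is removed with it.
func releasePerExecution(execID string) error {
	name := "wf-" + execID
	if err := exec.Command("docker", "rm", "-f", name).Run(); err != nil {
		return err
	}
	return exec.Command("docker", "volume", "rm", name+"-ws").Run()
}
```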
## `per_workflow` — long-lived per-workflow replicas
The container is created on first execution and persists. Subsequent executions of the same workflow round-robin across the pool’s `replicas` containers. Workspace isolation is achieved by giving each execution its own `/workspace/<execID>` subdirectory inside the shared container — there’s no rebuild between runs, but writes from one execution don’t appear in another’s working dir.
A workflow’s pool scales up automatically when its `replicas` count is increased and you trigger a new execution. The broker calls `DestroyWorkflow(workflowID)` to tear pools down on demand.
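A minimal sketch of the replica selection, assuming an atomic round-robin cursor; `workflowPool` and `pick` are hypothetical names, not the broker's real internals:

```go
package workflows

import "sync/atomic"

// Hypothetical pool internals; only the round-robin distribution and the
// per-execution workspace layout come from the docs.
type workflowPool struct {
	containers []string      // long-lived container IDs, len == replicas
	next       atomic.Uint64 // round-robin cursor
}

// pick returns the container for the next execution plus the private
// workspace path that execution gets inside it.
func (p *workflowPool) pick(execID string) (container, workspace string) {
	i := p.next.Add(1) % uint64(len(p.containers))
	return p.containers[i], "/workspace/" + execID
}
```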
## `shared` — process-wide pool
The default for the `chat` workflow type. A single pool of `replicas` containers serves every workflow that opts into `shared`. Round-robin distributes load. Per-execution workspaces still apply (`/workspace/<execID>`) — the container’s filesystem is shared, but logical workspaces stay separate.
`shared` mode is best for workflows that don’t write to `/workspace` heavily and benefit from skipping the cold start entirely. The chat handler is the canonical example.
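For illustration, running one node's command inside a shared container with the execution's private working directory might look like the following sketch; `runInShared` is a hypothetical helper, while the `-w` flag and `mkdir` mirror the documented `/workspace/<execID>` layout:

```go
package workflows

import "os/exec"

// Sketch only: run one node's command in a shared container, confined to
// the execution's private working directory.
func runInShared(container, execID string, command ...string) error {
	ws := "/workspace/" + execID
	// Create the per-execution workspace so concurrent runs don't collide.
	if err := exec.Command("docker", "exec", container, "mkdir", "-p", ws).Run(); err != nil {
		return err
	}
	// Execute the node's command with that workspace as its cwd.
	args := append([]string{"exec", "-w", ws, container}, command...)
	return exec.Command("docker", args...).Run()
}
```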
## Defaults per workflow type
`DefaultRuntimeConfigForType` in `runtime_broker.go` sets the floor:
| Workflow type | Mode | Image | CPU | Memory |
|---|---|---|---|---|
| `agent` | `per_execution` | `base` | 2 | 2Gi |
| `ci` | `per_execution` | `base` | 2 | 1Gi |
| `chat` | `shared` | `base` | 1 | 512Mi |
| `automation` (default) | `per_execution` | `base` | 1 | 512Mi |
Per-workflow `runtime_config` (a JSONB column on the workflow row) overrides any field on top of the defaults. Empty values fall through to the defaults rather than zeroing them out.
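The fall-through behavior can be pictured as a field-by-field merge. This is a sketch, not the real `DefaultRuntimeConfigForType` plumbing; the struct mirrors the JSON surface documented in the next section and `mergeWithDefaults` is a hypothetical name:

```go
package workflows

// Sketch of the documented config surface.
type RuntimeConfig struct {
	Mode     string
	Image    string
	CPU      string
	Memory   string
	Replicas int
}

// mergeWithDefaults keeps every default unless the stored runtime_config
// explicitly sets the field, so empty values never zero anything out.
func mergeWithDefaults(override, def RuntimeConfig) RuntimeConfig {
	out := def
	if override.Mode != "" {
		out.Mode = override.Mode
	}
	if override.Image != "" {
		out.Image = override.Image
	}
	if override.CPU != "" {
		out.CPU = override.CPU
	}
	if override.Memory != "" {
		out.Memory = override.Memory
	}
	if override.Replicas > 0 {
		out.Replicas = override.Replicas
	}
	return out
}
```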
## Configuration
A workflow’s `runtime_config` is stored as JSON. The full surface:

```json
{
  "mode": "per_execution",
  "image": "claude-code",
  "cpu": "4",
  "memory": "4Gi",
  "replicas": 3,
  "docker_socket": false,
  "permissions": ["forge:read", "issues:write"]
}
```
| Field | Type | Meaning |
|---|---|---|
| `mode` | `per_execution` \| `per_workflow` \| `shared` | Defaults to `per_execution`; type-specific defaults apply when omitted |
| `image` | string | Short name (`base`, `dev-node`, `claude-code`) → resolved to `ghcr.io/proxifai/agent-images/<name>:latest`. A `/` or `:` makes it a full reference, used as-is. |
| `cpu`, `memory` | strings | Cgroup limits (`2`, `512Mi`, `4Gi`) |
| `replicas` | int | Pool size for `per_workflow` and `shared` (≥1) |
| `docker_socket` | bool | Mount `/var/run/docker.sock` into the container — required for nodes that build images or run `docker` themselves |
| `permissions` | string[] | Scopes baked into the `pfai_<execID>` token injected as `PFAI_TOKEN`. The CLI’s calls inherit these. |
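The image resolution rule from the table, sketched as a function (`resolveImage` is a hypothetical name; the rule itself is as documented):

```go
package workflows

import "strings"

// resolveImage applies the documented rule: bare short names map into the
// agent-images registry; anything containing '/' or ':' is treated as a
// full reference and passes through untouched.
func resolveImage(name string) string {
	if strings.ContainsAny(name, "/:") {
		return name // e.g. my-registry.example.com/my-image:tag
	}
	return "ghcr.io/proxifai/agent-images/" + name + ":latest"
}
```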
## Sidecars (multi-image workflows)
Nodes can request a different image than the main runtime via `AcquireSidecar(ctx, execID, nodeID, image)`. The broker creates one container per `(execID, image)` pair on first request, mounts the same `wf-<execID>-ws` volume at `/workspace`, and reuses it for every subsequent node that asks for the same image. All sidecars are torn down by `ReleaseSidecars` when the workflow execution completes.
This makes “build with `dev-node`, deploy with `dev-go`” workflows fast — the second node hits a warm container instead of a cold start.
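The reuse semantics amount to a cache keyed by `(execID, image)`. A sketch under that assumption; `sidecarCache` and its internals are illustrative, while the documented `AcquireSidecar` behavior is the only real part:

```go
package workflows

import "sync"

// Illustrative internals for sidecar reuse: one container per
// (execID, image) pair, created on first request and shared afterward.
type sidecarKey struct{ execID, image string }

type sidecarCache struct {
	mu   sync.Mutex
	byID map[sidecarKey]string // container ID per (execID, image) pair
}

// acquire returns the warm container on repeat requests and only calls
// create (which mounts wf-<execID>-ws at /workspace) on the first one.
func (c *sidecarCache) acquire(execID, image string, create func() (string, error)) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	key := sidecarKey{execID, image}
	if id, ok := c.byID[key]; ok {
		return id, nil // warm container: no cold start
	}
	id, err := create()
	if err != nil {
		return "", err
	}
	if c.byID == nil {
		c.byID = make(map[sidecarKey]string)
	}
	c.byID[key] = id
	return id, nil
}
```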
## Pre-built agent images
The `ghcr.io/proxifai/agent-images/` registry holds the images referenced by short names. Source is at `agent-images/` in the platform repo.
| Short name | Tag | What’s inside |
|---|---|---|
| `base` | `:latest` | Foundation image — sshd, tmux auto-attach, common shell utilities; entrypoint clones the target repo and runs the workflow script in tmux |
| `dev-node`, `dev-go`, `dev-python`, `dev-rust` | `:latest` | `base` + a language toolchain |
| `aider` | `:latest` | Aider coding agent |
| `claude-code` | `:latest` | Anthropic’s Claude Code CLI |
| `copilot` | `:latest` | GitHub Copilot CLI |
| `cursor` | `:latest` | Cursor in a headless environment |
| `gemini-cli` | `:latest` | Google’s Gemini CLI |
| `opencode` | `:latest` | OpenCode coding agent |
To use a custom image, set `runtime_config.image` to a fully-qualified reference (`my-registry.example.com/my-image:tag`) and ensure it follows the entrypoint contract: idle on `sleep infinity` (the broker handles `docker exec`) or run your own loop that respects the `PFAI_TOKEN` / `PFAI_SERVER` / `PFAI_EXECUTION_ID` env vars.
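A minimal sketch of the “own loop” variant of that contract; everything here beyond the three documented env vars is an assumption:

```go
package main

import (
	"log"
	"os"
	"time"
)

// Hypothetical custom entrypoint. The three env vars are the documented
// contract; what you do with them is up to the image author.
func main() {
	token := os.Getenv("PFAI_TOKEN")         // pfai_<execID>_<sig>
	server := os.Getenv("PFAI_SERVER")       // API base URL
	execID := os.Getenv("PFAI_EXECUTION_ID") // current execution
	if token == "" || server == "" || execID == "" {
		log.Fatal("not started by the runtime broker")
	}
	// ... drive your own tooling here, authenticating with PFAI_TOKEN ...
	for {
		time.Sleep(time.Hour) // or just `sleep infinity` and let the broker exec in
	}
}
```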
## Auto-injected environment
Every container started by the broker receives:
| Env var | Source |
|---|---|
| `PFAI_TOKEN` | HMAC-signed `pfai_<execID>_<sig>` — see AI Gateway auth |
| `PFAI_SERVER` | Server base URL (`BASE_URL` from main config) |
| `PFAI_EXECUTION_ID` | Current execution ID — picked up automatically by `pfai exec*` |
| `PROXIFAI_GIT_TOKEN` | Forge clone token, scoped to the workflow’s repo |
| `LOCAL_ENDPOINT` | Local LLM endpoint when configured (Ollama, etc.) |
These are set in the parent context via `ContextWithExecEnv` and merged into the container’s env block at create time.
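A plausible shape for `ContextWithExecEnv`, assuming it stashes the env map in the context for the create path to read back; the real helper’s signature may differ, and `execEnvFrom` is hypothetical:

```go
package workflows

import "context"

type execEnvKey struct{}

// Plausible shape only; the real ContextWithExecEnv may differ.
func ContextWithExecEnv(ctx context.Context, env map[string]string) context.Context {
	return context.WithValue(ctx, execEnvKey{}, env)
}

// execEnvFrom is what the container-create path would call to merge these
// vars into the container's env block.
func execEnvFrom(ctx context.Context) map[string]string {
	env, _ := ctx.Value(execEnvKey{}).(map[string]string)
	return env
}
```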
## Choosing a mode
```
Workflow has nodes that…
├─ write to /workspace ──────────────────────► per_execution
└─ are stateless / orchestration only
   ├─ fired more than a few times/min ───────► per_workflow (with replicas)
   └─ rarely fired (cold start is OK) ───────► per_execution

Chat / lightweight per-message handlers ─────► shared
```
The defaults are good for most cases. Reach for `per_workflow` only when the cold start of `per_execution` is dominating your wall-clock time and the workflow is safe to run in a long-lived container.
## See also
- Live streaming, history, approvals, retries, artifacts, and cost tracking — what happens after `Acquire` returns a handle.
- Agent-specific concerns: terminal/VNC streaming, port forwarding, supported runtimes.
- Pipelines reuse the same runtime broker — they always run in `per_execution` mode by default.
- Where the broker sits in the single-binary topology.