# Execution modes

How workflows actually run — three container modes (`per_execution`, `per_workflow`, `shared`), workflow-type defaults, and the configuration that controls them.
Every workflow executes inside a container. Three runtime modes trade off cold-start latency against isolation and resource use. The mode is set per workflow (with sensible defaults per workflow type), and a single broker — `RuntimeBroker` in `internal/workflows/runtime_broker.go` — picks the right provider at acquire time.
There is no “host mode” — workflows never run directly on the API process. If no container runtime is available, `Acquire` returns an error.
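To make the dispatch concrete, here is a minimal sketch of a broker that selects a provider by mode at acquire time. The `provider` interface, the struct fields, and the error text are illustrative assumptions; only `RuntimeBroker` and `Acquire` are names from the source.

```go
package workflows

import (
	"context"
	"errors"
)

// Illustrative shapes only; the real RuntimeBroker in
// internal/workflows/runtime_broker.go differs in detail.
type RuntimeConfig struct {
	Mode string // "per_execution" | "per_workflow" | "shared"
}

type Handle struct{ ContainerID string }

type provider interface {
	acquire(ctx context.Context, execID string) (*Handle, error)
}

type RuntimeBroker struct {
	perExec     provider
	perWorkflow func(workflowID string) provider
	shared      provider
}

// Acquire picks a provider by mode. There is no host fallback: without a
// container runtime behind these providers, acquisition simply errors.
func (b *RuntimeBroker) Acquire(ctx context.Context, workflowID, execID string, cfg RuntimeConfig) (*Handle, error) {
	switch cfg.Mode {
	case "per_workflow":
		return b.perWorkflow(workflowID).acquire(ctx, execID)
	case "shared":
		return b.shared.acquire(ctx, execID)
	case "", "per_execution": // type defaults fill Mode before this point
		return b.perExec.acquire(ctx, execID)
	default:
		return nil, errors.New("unknown runtime mode: " + cfg.Mode)
	}
}
```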
## The three modes
| Mode | Container lifetime | Cold start | Isolation | Use it for |
|---|---|---|---|---|
| `per_execution` | One container per run, torn down on `Release` | High (full create + start) | Highest — fresh filesystem every time | Agents, CI/CD, untrusted code, anything that mutates `/workspace` |
| `per_workflow` | Long-lived containers per workflow ID, with N replicas | One-time on first run | Medium — `/workspace/<execID>` per run inside a shared container | Frequently triggered workflows, pre-installed toolchains |
| `shared` | One pool of long-lived containers shared across all workflows | One-time at boot | Lower — same container serves many workflows | Chat handlers, lightweight orchestration |
In all three modes, each execution gets its own logical workspace; the difference is whether the container itself persists.
## `per_execution` — fresh container per run
The default for `agent`, `ci`, and `automation` workflow types. On `Acquire`:
- Create a container named `wf-<execID>` from the configured image.
- Mount a named volume `wf-<execID>-ws` at `/workspace` so sidecars (nodes that need a different image) can share files.
- Optionally mount the Docker socket (`docker_socket: true`) and publish container ports to the host (`publish_ports: [22, …]`) for SSH/preview workloads.
- Run an idle command (`sh -c "mkdir -p /workspace && sleep infinity"`) so the runtime can `docker exec` into it.
On `Release`, the container is stopped and removed. The named volume goes away with it.
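A hedged sketch of those steps using the docker CLI (the real broker may call the runtime API directly; `acquirePerExecution` is a hypothetical name, and the optional socket mount and port publishing are omitted):

```go
package workflows

import (
	"fmt"
	"os/exec"
)

// Sketch of the documented per_execution steps via the docker CLI.
func acquirePerExecution(execID, image string) error {
	name := "wf-" + execID
	steps := [][]string{
		// Named volume so sidecars can share /workspace.
		{"docker", "volume", "create", name + "-ws"},
		// Idle command keeps the container alive for later docker exec.
		{"docker", "run", "-d", "--name", name,
			"-v", name + "-ws:/workspace",
			image, "sh", "-c", "mkdir -p /workspace && sleep infinity"},
	}
	for _, s := range steps {
		if out, err := exec.Command(s[0], s[1:]...).CombinedOutput(); err != nil {
			return fmt.Errorf("%v failed: %w: %s", s, err, out)
		}
	}
	return nil
}

// releasePerExecution mirrors Release: stop and remove the container; the
// named volume is removed with it.
func releasePerExecution(execID string) error {
	name := "wf-" + execID
	if err := exec.Command("docker", "rm", "-f", name).Run(); err != nil {
		return err
	}
	return exec.Command("docker", "volume", "rm", name+"-ws").Run()
}
```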
## `per_workflow` — long-lived per-workflow replicas
The container is created on first execution and persists. Subsequent executions of the same workflow round-robin across the pool’s `replicas` containers. Workspace isolation is achieved by giving each execution its own `/workspace/<execID>` subdirectory inside the shared container — there’s no rebuild between runs, but writes from one execution don’t appear in another’s working dir.
A workflow’s pool scales up automatically when its `replicas` count is increased and you trigger a new execution. The broker calls `DestroyWorkflow(workflowID)` to tear pools down on demand.
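A minimal sketch of the replica selection, assuming an atomic round-robin cursor; `workflowPool` and `pick` are hypothetical names, not the broker's real internals:

```go
package workflows

import "sync/atomic"

// Hypothetical pool internals; only the round-robin distribution and the
// per-execution workspace layout come from the docs.
type workflowPool struct {
	containers []string      // long-lived container IDs, len == replicas
	next       atomic.Uint64 // round-robin cursor
}

// pick returns the container for the next execution plus the private
// workspace path that execution gets inside it.
func (p *workflowPool) pick(execID string) (container, workspace string) {
	i := p.next.Add(1) % uint64(len(p.containers))
	return p.containers[i], "/workspace/" + execID
}
```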
## `shared` — process-wide pool
The default for the `chat` workflow type. A single pool of `replicas` containers serves every workflow that opts into `shared`. Round-robin distributes load. Per-execution workspaces still apply (`/workspace/<execID>`) — the container’s filesystem is shared, but logical workspaces stay separate.
`shared` mode is best for workflows that don’t write to `/workspace` heavily and benefit from skipping the cold start entirely. The chat handler is the canonical example.
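For illustration, running one node's command inside a shared container with the execution's private working directory might look like the following sketch; `runInShared` is a hypothetical helper, while the `-w` flag and `mkdir` mirror the documented `/workspace/<execID>` layout:

```go
package workflows

import "os/exec"

// Sketch only: run one node's command in a shared container, confined to
// the execution's private working directory.
func runInShared(container, execID string, command ...string) error {
	ws := "/workspace/" + execID
	// Create the per-execution workspace so concurrent runs don't collide.
	if err := exec.Command("docker", "exec", container, "mkdir", "-p", ws).Run(); err != nil {
		return err
	}
	// Execute the node's command with that workspace as its cwd.
	args := append([]string{"exec", "-w", ws, container}, command...)
	return exec.Command("docker", args...).Run()
}
```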
## Defaults per workflow type
`DefaultRuntimeConfigForType` in `runtime_broker.go` sets the floor:
| Workflow type | Mode | Image | CPU | Memory |
|---|---|---|---|---|
| `agent` | `per_execution` | `base` | 2 | 2Gi |
| `ci` | `per_execution` | `base` | 2 | 1Gi |
| `chat` | `shared` | `base` | 1 | 512Mi |
| `automation` (default) | `per_execution` | `base` | 1 | 512Mi |
Per-workflow `runtime_config` (a JSONB column on the workflow row) overrides any field on top of the defaults. Empty values fall through to the defaults rather than zeroing them out.
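The fall-through behavior can be pictured as a field-by-field merge. This is a sketch, not the real `DefaultRuntimeConfigForType` plumbing; the struct mirrors the JSON surface documented in the next section and `mergeWithDefaults` is a hypothetical name:

```go
package workflows

// Sketch of the documented config surface.
type RuntimeConfig struct {
	Mode     string
	Image    string
	CPU      string
	Memory   string
	Replicas int
}

// mergeWithDefaults keeps every default unless the stored runtime_config
// explicitly sets the field, so empty values never zero anything out.
func mergeWithDefaults(override, def RuntimeConfig) RuntimeConfig {
	out := def
	if override.Mode != "" {
		out.Mode = override.Mode
	}
	if override.Image != "" {
		out.Image = override.Image
	}
	if override.CPU != "" {
		out.CPU = override.CPU
	}
	if override.Memory != "" {
		out.Memory = override.Memory
	}
	if override.Replicas > 0 {
		out.Replicas = override.Replicas
	}
	return out
}
```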
## Configuration
A workflow’s `runtime_config` is stored as JSON. The full surface:

```json
{
  "mode": "per_execution",
  "image": "claude-code",
  "cpu": "4",
  "memory": "4Gi",
  "replicas": 3,
  "docker_socket": false,
  "permissions": ["forge:read", "issues:write"]
}
```
| Field | Type | Meaning |
|---|---|---|
| `mode` | `per_execution` \| `per_workflow` \| `shared` | Defaults to `per_execution`; type-specific defaults apply when omitted |
| `image` | string | Short name (`base`, `dev-node`, `claude-code`) → resolved to `ghcr.io/proxifai/agent-images/<name>:latest`. A `/` or `:` makes it a full reference, used as-is. |
| `cpu`, `memory` | strings | Cgroup limits (`2`, `512Mi`, `4Gi`) |
| `replicas` | int | Pool size for `per_workflow` and `shared` (≥1) |
| `docker_socket` | bool | Mount `/var/run/docker.sock` into the container — required for nodes that build images or run `docker` themselves |
| `permissions` | string[] | Scopes baked into the `pfai_<execID>` token injected as `PFAI_TOKEN`. The CLI’s calls inherit these. |
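The image resolution rule from the table, sketched as a function (`resolveImage` is a hypothetical name; the rule itself is as documented):

```go
package workflows

import "strings"

// resolveImage applies the documented rule: bare short names map into the
// agent-images registry; anything containing '/' or ':' is treated as a
// full reference and passes through untouched.
func resolveImage(name string) string {
	if strings.ContainsAny(name, "/:") {
		return name // e.g. my-registry.example.com/my-image:tag
	}
	return "ghcr.io/proxifai/agent-images/" + name + ":latest"
}
```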
## Sidecars (multi-image workflows)
Nodes can request a different image than the main runtime via `AcquireSidecar(ctx, execID, nodeID, image)`. The broker creates one container per `(execID, image)` pair on first request, mounts the same `wf-<execID>-ws` volume at `/workspace`, and reuses it for every subsequent node that asks for the same image. All sidecars are torn down by `ReleaseSidecars` when the workflow execution completes.
This makes “build with `dev-node`, deploy with `dev-go`” workflows fast — the second node hits a warm container instead of a cold start.
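The reuse semantics amount to a cache keyed by `(execID, image)`. A sketch under that assumption; `sidecarCache` and its internals are illustrative, while the documented `AcquireSidecar` behavior is the only real part:

```go
package workflows

import "sync"

// Illustrative internals for sidecar reuse: one container per
// (execID, image) pair, created on first request and shared afterward.
type sidecarKey struct{ execID, image string }

type sidecarCache struct {
	mu   sync.Mutex
	byID map[sidecarKey]string // container ID per (execID, image) pair
}

// acquire returns the warm container on repeat requests and only calls
// create (which mounts wf-<execID>-ws at /workspace) on the first one.
func (c *sidecarCache) acquire(execID, image string, create func() (string, error)) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	key := sidecarKey{execID, image}
	if id, ok := c.byID[key]; ok {
		return id, nil // warm container: no cold start
	}
	id, err := create()
	if err != nil {
		return "", err
	}
	if c.byID == nil {
		c.byID = make(map[sidecarKey]string)
	}
	c.byID[key] = id
	return id, nil
}
```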
## Pre-built agent images
The `ghcr.io/proxifai/agent-images/` registry holds the images referenced by short names. Source is at `agent-images/` in the platform repo.
| Short name | Tag | What’s inside |
|---|---|---|
| `base` | `:latest` | Foundation image — sshd, tmux auto-attach, common shell utilities; entrypoint clones the target repo and runs the workflow script in tmux |
| `dev-node`, `dev-go`, `dev-python`, `dev-rust` | `:latest` | `base` + a language toolchain |
| `aider` | `:latest` | Aider coding agent |
| `claude-code` | `:latest` | Anthropic’s Claude Code CLI |
| `copilot` | `:latest` | GitHub Copilot CLI |
| `cursor` | `:latest` | Cursor in a headless environment |
| `gemini-cli` | `:latest` | Google’s Gemini CLI |
| `opencode` | `:latest` | OpenCode coding agent |
To use a custom image, set `runtime_config.image` to a fully-qualified reference (`my-registry.example.com/my-image:tag`) and ensure it follows the entrypoint contract: idle on `sleep infinity` (the broker handles `docker exec`) or run your own loop that respects the `PFAI_TOKEN` / `PFAI_SERVER` / `PFAI_EXECUTION_ID` env vars.
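A minimal sketch of the “own loop” variant of that contract; everything here beyond the three documented env vars is an assumption:

```go
package main

import (
	"log"
	"os"
	"time"
)

// Hypothetical custom entrypoint. The three env vars are the documented
// contract; what you do with them is up to the image author.
func main() {
	token := os.Getenv("PFAI_TOKEN")         // pfai_<execID>_<sig>
	server := os.Getenv("PFAI_SERVER")       // API base URL
	execID := os.Getenv("PFAI_EXECUTION_ID") // current execution
	if token == "" || server == "" || execID == "" {
		log.Fatal("not started by the runtime broker")
	}
	// ... drive your own tooling here, authenticating with PFAI_TOKEN ...
	for {
		time.Sleep(time.Hour) // or just `sleep infinity` and let the broker exec in
	}
}
```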
## Auto-injected environment
Every container started by the broker receives:
| Env var | Source |
|---|---|
| `PFAI_TOKEN` | HMAC-signed `pfai_<execID>_<sig>` — see AI Gateway auth |
| `PFAI_SERVER` | Server base URL (`BASE_URL` from main config) |
| `PFAI_EXECUTION_ID` | Current execution ID — picked up automatically by `pfai exec*` |
| `PROXIFAI_GIT_TOKEN` | Forge clone token, scoped to the workflow’s repo |
| `LOCAL_ENDPOINT` | Local LLM endpoint when configured (Ollama, etc.) |
These are set in the parent context via `ContextWithExecEnv` and merged into the container’s env block at create time.
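A plausible shape for `ContextWithExecEnv`, assuming it stashes the env map in the context for the create path to read back; the real helper’s signature may differ, and `execEnvFrom` is hypothetical:

```go
package workflows

import "context"

type execEnvKey struct{}

// Plausible shape only; the real ContextWithExecEnv may differ.
func ContextWithExecEnv(ctx context.Context, env map[string]string) context.Context {
	return context.WithValue(ctx, execEnvKey{}, env)
}

// execEnvFrom is what the container-create path would call to merge these
// vars into the container's env block.
func execEnvFrom(ctx context.Context) map[string]string {
	env, _ := ctx.Value(execEnvKey{}).(map[string]string)
	return env
}
```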
## Choosing a mode
```
Workflow has nodes that…
├─ write to /workspace ──────────────────────► per_execution
└─ are stateless / orchestration only
   ├─ fired more than a few times/min ───────► per_workflow (with replicas)
   └─ rarely fired (cold start is OK) ───────► per_execution

Chat / lightweight per-message handlers ─────► shared
```
The defaults are good for most cases. Reach for `per_workflow` only when the cold start of `per_execution` is dominating your wall-clock time and the workflow is safe to run in a long-lived container.
## See also
- Live streaming, history, approvals, retries, artifacts, and cost tracking — what happens after `Acquire` returns a handle.
- Agent-specific concerns: terminal/VNC streaming, port forwarding, supported runtimes.
- Pipelines reuse the same runtime broker — they always run in `per_execution` mode by default.
- Where the broker sits in the single-binary topology.