# Chat Modes
Four chat modes — ask, plan, code, build — each with a different tool palette, iteration budget, and runtime requirement.
ProxifAI’s AI chat ships four modes — ask, plan, code, and build. The mode chosen at the start of a turn decides which tools the model can call, how many tool-loop iterations it gets, and whether a development container is provisioned. The implementation lives in `internal/knowledgebase/api/handlers/tool_registry.go` (`ToolsForMode`) and `internal/workflows/chat_agent_executor.go`.
## Mode comparison

| | ask | plan | code | build |
|---|---|---|---|---|
| Knowledge base search | ✅ | ✅ | ✅ | ✅ |
| MCP platform tools (issues, projects, agents) | ✅ | ✅ | ✅ | ✅ |
| Code intelligence + platform graph | ✅ | ✅ | ✅ | ✅ |
| Cloud tools (clusters, workloads, logs) | ✅ | ✅ | ✅ | ✅ |
| Workflow read | ✅ | ✅ | ✅ | ✅ |
| Forge read (files, branches, commits, PR diff) | — | ✅ | ✅ | ✅ |
| Forge write (`create_pull_request`) | — | — | ✅ | ✅ |
| Workflow write (`execute_workflow`) | — | — | ✅ | ✅ |
| Instance tools (shell, file, git, browser, screenshot) | — | — | ✅ | ✅ |
| Max tool-loop iterations | 10 | 10 | 25 | 25 |
| Max tokens per LLM call | 4,096 | 4,096 | 8,192 | 8,192 |
| Provisions a runtime container | — | — | ✅ | ✅ |
The only structural difference between `code` and `build` is convention — both have identical tool access and budgets in the OSS build. Use `code` for typical chat-driven development and `build` when you want the conversation log and UI to reflect “I’m shipping a feature” semantics.
## ask — Q&A over your workspace

Read-only mode focused on retrieval. The model has the knowledge base, code-intelligence graph, MCP platform tools (`list_issues`, `get_project`, `list_agents`, …), and cloud tools — enough to answer questions that span code, planning, and infrastructure without touching anything.
Examples:
- “Where do we set the database connection pool size?”
- “What issues are blocking the v2 release?”
- “How does the auth middleware handle expired tokens?”
Citations come back as a structured `sources` field on the response that the chat UI renders alongside the message — every claim links back to a file, issue, PR, or doc.
## plan — Read-only analysis with full code visibility

Adds the read-only forge surface to ask’s tool palette: `list_repositories`, `get_file`, `list_files`, `list_branches`, `list_commits`, `list_pull_requests`, `get_pull_request_diff`. The model can browse the codebase like a reviewer would but cannot push, write files, or run commands.
Examples:
- “Review the architecture of our notification system and identify potential bottlenecks”
- “Which open issues are highest risk for the upcoming release? Skim the diffs and tell me which PRs likely break things.”
- “Compare how rate limiting is implemented in `payments/` vs `billing/`”
This is the right mode for code review, audit, summarization, and triage workflows where you want the model grounded in current code without giving it write access.
## code — Live development inside a container

Adds the write surface and a real runtime: every code-mode session provisions a container (default `per_execution` with the base image), checks out the target repo into `/workspace`, and exposes 10 instance tools the model can call:
| Tool | Effect |
|---|---|
| `exec_command` | Run a shell command in `/workspace`; returns stdout, stderr, exit code |
| `read_file` · `write_file` · `list_directory` | File ops scoped to the container |
| `git_diff` · `git_commit` · `git_push` | Standard git plumbing |
| `screenshot` · `computer` | Desktop GUI agent: clicks, keystrokes, scroll, screenshots |
| `browser_action` | Playwright-driven browser (Chromium): navigate, click, fill, evaluate |
Plus `create_pull_request` (forge write) and `execute_workflow`. The combination is enough for the chat agent to clone, edit, test, commit, push, and open a PR — all in one turn.
```
You: "Add input validation to the user registration endpoint and write tests"
→ code mode dispatches a per_execution container
→ reads the existing handler with read_file
→ exec_command runs go test to capture baseline
→ write_file edits handler.go and adds validation_test.go
→ exec_command runs go test again to confirm green
→ git_commit / git_push / create_pull_request
```
Tool calls stream live to the UI, including terminal output and (for computer/browser_action) screenshots between steps.
## build — Same tools as code, different framing
Identical capabilities to code. Use it when the conversation is about shipping a feature end-to-end rather than ad-hoc tinkering — the UI surfaces a “build session” view with linked PRs and pipeline runs, and the conversation history is filterable separately from one-off code-mode chats. The 25-iteration budget is enough for most multi-file features without splitting into multiple turns.
## How a turn runs
1. Frontend posts `{mode, query, conversationId, model}`
2. Chat handler calls `ToolRegistry.ToolsForMode(mode)` for the palette
3. Knowledge base retrieval runs first — sources are streamed back via SSE
4. `ChatAgentExecutor` runs a streaming tool-call loop:

   ```
   for iter in 1..max_iter:
       LLM streams response (with tools attached)
       if no tool calls → finish
       for each tool call:
           ToolRegistry.Execute(mode, name, args, runtimeID)
       feed tool results back into the conversation
   ```

5. Final assistant message + sources + iteration trace persist to the DB
The iteration trace (per-iteration LLM tokens, tool calls, durations) is stored on the message so the UI can render the “agent steps” expander on each chat reply. Token usage feeds the LLM gateway usage stream.
## Switching modes mid-conversation

Mode is a per-turn property — you can switch at any time, and the new turn’s tool palette and iteration budget take effect immediately. The conversation history carries forward; the same `conversationId` keeps things stitched together in the UI. Common pattern:
- Start in `ask` to scope the problem
- Move to `plan` to review the relevant files and PRs
- Switch to `code` to actually make changes
## See also

- The retrieval layer every mode draws on — instant, semantic, hybrid search across your workspace.
- What runs when a chat issues an instance tool — six pre-built agent images and the runtime broker.
- The provider router every chat call goes through — model catalog, BYOK, fallback.
- The container types code/build chats run in.